NUCLEIC ACID SEQUENCES AND EXPRESSION SYSTEMS FOR
HEPARINASE II AND HEPARINASE III
DERIVED FROM Flavobacterium heparinum

BACKGROUND OF THE INVENTION 

This invention is directed to cloning, sequencing and expressing
heparinase II and heparinase III from Flavobacterium heparinum.

The heparin and heparan sulfate family of molecules comprises
glycosaminoglycans of repeating glucosamine and hexuronic acid residues,
either iduronic or glucuronic, in which the 2, 3 or 6 position of glucosamine
or the 2 position of the hexuronic acid may be sulfated. Variations in the
extent and location of sulfation, as well as in the conformation of the
alternating hexuronic acid residue, lead to a high degree of heterogeneity
among the molecules within this family. Conventionally, heparin refers to
molecules which possess a high sulfate content, 2.6 sulfates per disaccharide,
and a higher amount of iduronic acid. Conversely, heparan sulfate contains lower
amounts of sulfate, 0.7 to 1.3 sulfates per disaccharide, and less iduronic
acid. However, variants of intermediate composition exist, and heparins
from all biological sources have not yet been characterized.

Specific sulfation/glycosylation patterns of heparin have been
associated with biological function, such as the antithrombin binding site
described by Choay et al., Thrombosis Res. 18: 573-578 (1980), and the
fibroblast growth factor binding site described by Turnbull et al., J. Biol.
Chem. 267: 10337-10341 (1992). It is apparent from these examples that
heparin's interaction with certain molecules results from the conformation
imparted by specific sequences and is not solely due to electrostatic
interactions imparted by its high sulfate composition. Heparin interacts
with a variety of mammalian molecules, thereby modulating several
biological events such as hemostasis, cell proliferation, migration and
adhesion, as summarized by Kjellen and Lindahl, Ann. Rev. Biochem. 60:
443-475 (1991) and Burgess and Macaig, Ann. Rev. Biochem. 58: 575-606
(1989). Heparin, extracted from bovine lungs and porcine intestines, has
been used as an anticoagulant since its antithrombotic properties were
discovered by McLean, Am. J. Physiol. 41: 250-257 (1916). Heparin and
chemically modified heparins are continually under review for medical
applications in the areas of wound healing and treating vascular disease.
Heparin degrading enzymes, referred to as heparinases or heparin
lyases, have been identified in several microorganisms including
Flavobacterium heparinum, Bacteroides sp. and Aspergillus nidulans, as
summarized by Linhardt et al., Appl. Biochem. Biotechnol. 12: 135-177
(1986). Heparan sulfate degrading enzymes, referred to as heparitinases
or heparan sulfate lyases, have been detected in platelets (Oldberg et al.,
Biochemistry 19: 5755-5762 (1980)), tumor (Nakajima et al., J. Biol. Chem.
259: 2283-2290 (1984)) and endothelial cells (Gaal et al., Biochem.
Biophys. Res. Comm. 161: 604-614 (1989)). Mammalian heparanases
catalyze the hydrolysis of the carbohydrate backbone of heparan sulfate at
the hexuronic acid (1 -> 4) glucosamine linkage (Nakajima et al., J. Cell
Biochem. 36: 157-167 (1988)) and are inhibited by the highly sulfated
heparin. However, accurate biochemical characterization of these
enzymes has thus far been prevented by the lack of a method to obtain
homogeneous preparations of the molecules.

Flavobacterium heparinum produces heparin and heparan sulfate
degrading enzymes termed heparinase I (E.C. 4.2.2.7) as described by Yang
et al., J. Biol. Chem. 260(3): 1849-1857 (1985), heparinase II as described
by Zimmermann and Cooney, U.S. Patent No. 5,169,772, and heparinase III
(E.C. 4.2.2.8) as described by Lohse and Linhardt, J. Biol. Chem. 267:
24347-24355 (1992). These enzymes catalyze an eliminative cleavage of the
(α1 -> 4) carbohydrate bond between glucosamine and hexuronic acid
residues in the heparin/heparan sulfate backbone. The three enzyme
variants differ in their action on specific carbohydrate residues.
Heparinase I cleaves at α-D-GlcNp2S6S(1 -> 4)α-L-IdoAp2S, heparinase
III at α-D-GlcNp2Ac(or 2S)6OH(1 -> 4)β-D-GlcAp and heparinase II at
either linkage, as described by Desai et al., Arch. Biochem. Biophys.
306(2): 461-468 (1993). Secondary cleavage sites for each enzyme also
have been described by Desai et al.

Heparinase I has been used clinically to neutralize the
anticoagulant properties of heparin, as summarized by Baugh and
Zimmermann, Perfusion Rev. 1(2): 8-13 (1993). Heparinases I and III
have been shown to modulate cell-growth factor interactions, as
demonstrated by Bashkin et al., J. Cell Physiol. 257:126-137 (1992), and
cell-lipoprotein interactions, as demonstrated by Chappell et al., J. Biol.
Chem. 268(19): 14168-14175 (1993). The availability of heparin
degrading enzymes of sufficient purity and quantity could lead to the
development of important diagnostic and therapeutic formulations.

SUMMARY OF THE INVENTION 
Prior to the present invention, partially purified heparinases II
and III were available, but their amino acid sequences were unknown.
Cloning these enzymes was difficult because of toxicity to the host cells.
The present inventors were able to clone the genes for heparinases II
and III, and herein provide their nucleotide and amino acid sequences.
A method is described for the isolation of highly purified heparin
and heparan sulfate degrading enzymes from F. heparinum.
of each protein demonstrated that heparinases I, II and
:dns. All three proteins are modified at their N- 
acid residue. Antibodies generated by injecting purified 

rabbits yielded anti-sera which demonstrated a high 
reactivity to proteins from F. heparinum. Polyclonal 

separated by affinity chromatography into fractions 
imino acid portion of the proteins and a fraction which 
ranslational modification allowing for the use of these 
scifically distinguish the native and recombinant forms 
ise protein. 

i sequence information was used to synthesize 
that were subsequently used in a polymerase chain 
3 amplify a portion of the heparinase II and heparinase 
lifted regions were used in an attempt to identify clones 

gene library which contained F. heparinum genomic 
election against clones containing the entire heparinase 
was observed. This was circumvented by cloning 
sparinase n gene separately, and by screening host 

maintenance of complete heparinase III clones. 
)arinase II and HI was achieved by use of a vector 
ified ribosome binding site which was shown to 
ission of heparinase I to significant levels. 

describes the gene and amino acid sequences for 
III from F. heparinum, which may be used in
suitable expression systems to produce the enzymes. 
" a modified ribosome binding sequence used to 
- I. II, and III.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the modifications to the tac promoter ribosome
binding region, which were evaluated for the level of expression of
heparinase I. The original sequence, as found in pBhep, and the modified
sequences, as found in pGhep and pA4hep, are shown with the
Shine-Dalgarno sequences (S-D) and the heparinase I gene start codon
underlined. The gap (in nucleotides, nt) between these regions is indicated
below each sequence. The ribosome binding region for pGB contains no
start codon, and has a BamHI site (underlined) in place of the EcoRI site
(GAATTC) found in pGhep.

Figure 2 shows the construction of plasmids used to sequence the
heparinase II gene from Flavobacterium heparinum. Restriction sites are:
N = NotI, Nc = NcoI, S = SalI, B = BamHI, P = PstI, E = EcoRI, H = HindIII,
C = ClaI and K = KpnI.

Figure 3 shows the construction of pGBH2, a plasmid capable of
directing the expression of active heparinase II in E. coli from tandem tac
promoters (double arrow heads). Restriction sites are: B = BamHI, P = PstI.

Figure 4 shows the nucleic acid sequence for the heparinase II gene
from Flavobacterium heparinum (SEQ ID NO:1).

Figure 5 shows the amino acid sequence for heparinase II from
Flavobacterium heparinum (SEQ ID NO:2). The leader peptide sequence is
underlined. The mature protein starts at Q-26. Peptides 2A, 2B and 2C are
indicated at their corresponding positions within the protein.

Figure 6 shows the construction of plasmids used to sequence the
heparinase III gene from Flavobacterium heparinum. Restriction sites are:
S = SalI, B = BamHI, P = PstI, E = EcoRI, H = HindIII, C = ClaI and K = KpnI.

Figure 7 shows the construction of pGBH3, a plasmid capable of
directing the expression of active heparinase III in E. coli from a tandem
tac promoter (double arrow heads). Restriction sites are: S = SalI, B =
BamHI, P = PstI, E = EcoRI, H = HindIII, Bs = BspEI, C = ClaI and K = KpnI.

Figure 8 shows the nucleic acid sequence for the heparinase III gene
from Flavobacterium heparinum (SEQ ID NO:3).

Figure 9 shows the amino acid sequence for heparinase III from
Flavobacterium heparinum (SEQ ID NO:4). The leader peptide sequence is
underlined. The mature protein starts at Q-25. Peptides 3A, 3B and 3C are 
indicated at their corresponding positions within the protein. 



DETAILED DESCRIPTION OF THE INVENTION 
To aid in the understanding of the specification and claims, including 
the scope to be given such terms, the following definitions are provided. 
By the term "gene" is intended a DNA sequence which encodes,
through its template or messenger RNA, a sequence of amino acids
characteristic of a specific peptide. Further, the term includes intervening,
non-coding regions, as well as regulatory regions, and can include 5' and 3'
ends.

Gene sequence. The term "gene sequence" is intended to refer
generally to a DNA molecule which contains one or more genes, or gene
fragments, as well as a DNA molecule which contains a non-transcribed or
non-translated sequence. The term is further intended to include any
combination of gene(s), gene fragment(s), non-transcribed sequence(s) or
non-translated sequence(s) which are present on the same DNA molecule.
The present sequences may be derived from a variety of sources
including DNA, synthetic DNA, RNA, or combinations thereof. Such gene
sequences may comprise genomic DNA which may or may not include
naturally occurring introns. Moreover, such genomic DNA may be obtained
in association with promoter regions or poly A sequences. The gene
sequences, genomic DNA or cDNA may be obtained in any of several ways.
Genomic DNA can be extracted and purified from suitable cells, such as
brain cells, by means well known in the art. Alternatively, mRNA can be
isolated from a cell and used to produce cDNA by reverse transcription or
other means.

Recombinant DNA. By the term "recombinant DNA" is meant a
molecule that has been recombined by in vitro splicing of cDNA or a genomic
DNA sequence.

Cloning Vehicle. A plasmid or phage DNA or other DNA sequence
which is able to replicate in a host cell. The cloning vehicle is characterized
by one or more endonuclease recognition sites at which its DNA sequences
may be cut in a determinable fashion without loss of an essential biological
function of the DNA, and which may contain a marker suitable for use in the
identification of transformed cells. Markers include, for example,
tetracycline resistance or ampicillin resistance. The word "vector" can be
used to connote a cloning vehicle.

Expression Control Sequence. A sequence of nucleotides that controls
or regulates expression of structural genes when operably linked to those
genes. These include the lac system, the trp system, major operator and
promoter regions of the phage lambda, the control region of fd coat protein
and other sequences known to control the expression of genes in
prokaryotic or eukaryotic cells.
Expression vehicle. A vehicle or vector similar to a cloning vehicle
but which is capable of expressing a gene which has been cloned into it,
after transformation into a host. The cloned gene is usually placed under
the control of (i.e., operably linked to) certain control sequences such as
promoter sequences. Expression control sequences will vary depending on
whether the vector is designed to express the operably linked gene in a
prokaryotic or eukaryotic host and may additionally contain
transcriptional elements such as enhancer elements, termination
sequences, tissue-specificity elements, and/or translational initiation and
termination sites.

Promoter. The term "promoter" is intended to refer to a DNA
sequence which can be recognized by an RNA polymerase. The presence of
such a sequence permits the RNA polymerase to bind and initiate
transcription of operably linked gene sequences.

Promoter region. The term "promoter region" is intended to broadly
include both the promoter sequence as well as gene sequences which may
be necessary for the initiation of transcription. The presence of a promoter
region is, therefore, sufficient to cause the expression of an operably linked
gene sequence.

Operably linked. As used herein, the term "operably linked" means
that the promoter controls the initiation of expression of the gene. A
promoter is operably linked to a sequence of proximal DNA if upon
introduction into a host cell the promoter determines the transcription of
the proximal DNA sequence or sequences into one or more species of RNA.
A promoter is operably linked to a DNA sequence if the promoter is
capable of initiating transcription of that DNA sequence.

Prokaryote. The term "prokaryote" is meant to include all organisms
without a true nucleus, including bacteria.

Host. The term "host" is meant to include not only prokaryotes, but
also such eukaryotes as yeast and filamentous fungi, as well as plant and
animal cells. The term includes any organism or cell that is the recipient of a
replicable expression vehicle.



The present invention is based on the cloning and expression of two
previously uncloned enzymes. Although heparinases II and III had been
partially purified previously, no amino acid sequences were available.
Specifically, the invention discloses the cloning, sequencing and expression
of heparinases II and III from Flavobacterium heparinum and the use of a
modified ribosome binding region for expression of these genes. In
addition to the nucleotide sequences, the amino acid sequences of
heparinases II and III are also provided. The invention further provides
expressed heparinases I, II and III, as well as methods of expressing those
enzymes.

Cloning was accomplished using degenerate and "guessmer"
nucleotide primers derived from amino acid sequences of fragments of the
heparinases, purified as described below in detail. The amino acid
sequences were previously unavailable. Cloning was exceptionally difficult
because of the unexpected problem of F. heparinum DNA toxicity in E. coli.
The inventors discovered techniques for solving this problem, as described
below in detail. Based on this disclosure, one skilled in the art can readily
clone additional heparinases and other proteins from F. heparinum or from
additional sources using the novel methods described within.
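
The degeneracy that primers of this kind must cover can be illustrated with a short sketch. The Python below is not part of the disclosed method; it only reverse-translates a peptide with the standard genetic code and enumerates every oligonucleotide that could encode it. The peptide "MWHK" and all function names are hypothetical and chosen purely for illustration. A "guessmer" would instead use a single most likely codon per residue, which is not modeled here.

```python
from itertools import product
from math import prod

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (one-letter code) to the list of DNA codons encoding it.
CODONS_FOR = {}
for (i, a), (j, b), (k, c) in product(enumerate(BASES), repeat=3):
    CODONS_FOR.setdefault(AMINO_ACIDS[16 * i + 4 * j + k], []).append(a + b + c)


def degeneracy(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide)


def degenerate_primers(peptide):
    """Every sense-strand oligonucleotide that encodes the peptide."""
    return ["".join(codons) for codons in product(*(CODONS_FOR[aa] for aa in peptide))]


# Hypothetical peptide, chosen only because its residues have few codons.
example = "MWHK"
print(degeneracy(example), sorted(degenerate_primers(example)))
```

Peptide stretches rich in Met and Trp (one codon each) keep the primer pool small, whereas Leu, Ser and Arg (six codons each) enlarge it quickly.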

Expression of the heparinases is a further disclosure of the present
invention. To express heparinases I, II and III, transcriptional and
translational signals recognizable by an appropriate host are necessary.
The cloned heparinase-encoding sequences, obtained through the methods
described above, and preferably in a double-stranded form, may be
operably linked to sequences controlling transcriptional expression in an
expression vector, and introduced into a host cell, either prokaryote or
eukaryote, to produce recombinant heparinases or a functional derivative
thereof. Depending upon which strand of the heparinase-encoding
sequence is operably linked to the sequences controlling transcriptional
expression, it is also possible to express heparinase antisense RNA or a
functional derivative thereof.

For the expression of heparinases I, II and III in E. coli, vectors were
constructed wherein expression was driven by two repeats of the tac
promoter. Modifications of the ribosome binding region of this promoter
were made by introducing mutations with the polymerase chain reaction (PCR).
In a preferred modification of the expression vector, the minimal
consensus Shine-Dalgarno sequence was improved by introducing a single
mutation (AGGAA -> AGGAG), which had the further advantage of
decreasing the number of nucleotides between the Shine-Dalgarno
sequence and the ATG start codon. Further modifications were produced
using PCR in which the gap between the Shine-Dalgarno sequence and the
start codon was further reduced. Using the same techniques, additional
modifications in this region, including insertions and deletions, can be
produced to create additional heparinase expression vectors. As a result,
an expression vector for the expression of heparinases is provided which
comprises a modified ribosome binding region containing a 5 base pair
Shine-Dalgarno sequence, a 9 base pair spacer region between the
Shine-Dalgarno sequence and the ATG start codon, and a recombinant nucleotide
sequence encoding a heparinase. Also provided are modifications to this vector
comprising changing the length and sequence of the Shine-Dalgarno
sequence, and also reducing the spacing between the Shine-Dalgarno
sequence and the start codon to 8, 7, 6, 5, 4 or fewer nucleotides. Methods
of expressing the heparinases using these novel expression vectors
comprise a preferred embodiment of the invention.
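
The effect of the AGGAA -> AGGAG change on the spacer length can be checked with a minimal sketch. The Python below is illustrative only: the two 20-nucleotide strings are hypothetical sequences made up for this example, not the sequences shown in Figure 1, and the function name is an assumption. It measures the number of nucleotides between the end of a Shine-Dalgarno motif and the next ATG.

```python
def sd_spacer(seq, sd_motif, start_codon="ATG"):
    """Count nucleotides between the first Shine-Dalgarno motif and the next start codon.

    Returns None if either element is absent.
    """
    seq = seq.upper()
    sd_start = seq.find(sd_motif)
    if sd_start < 0:
        return None
    sd_end = sd_start + len(sd_motif)
    start = seq.find(start_codon, sd_end)
    if start < 0:
        return None
    return start - sd_end


# Hypothetical ribosome binding regions (illustration only).
original = "CACAGGAAACAGAATTCATG"   # 4-nt AGGA motif followed by A
modified = "CACAGGAGACAGAATTCATG"   # single A -> G change gives a 5-nt AGGAG motif

print(sd_spacer(original, "AGGA"))   # spacer measured from the 4-nt motif
print(sd_spacer(modified, "AGGAG"))  # spacer measured from the 5-nt motif
```

Because the mutation extends the motif by one base toward the start codon, the spacer in this example shortens from 10 to 9 nucleotides without any deletion.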

Expression of the heparinases in different hosts may result in
different post-translational modifications which may alter the properties
of the heparinases. The heparinases, or a functional derivative thereof, may
be expressed in eukaryotic cells, and especially mammalian, insect and
yeast cells. Especially preferred
eukaryotic hosts are mammalian cells either in vivo, in animals or in tissue
culture. Mammalian cells provide post-translational modifications to
recombinant heparinases which include folding and/or glycosylation at
sites similar or identical to that found for the native heparinases. Most
preferably, mammalian host cells include brain and neuroblastoma cells.

A nucleic acid molecule, such as DNA, is said to be "capable of
expressing" a polypeptide if it contains expression control sequences which
contain transcriptional regulatory information and such sequences are
"operably linked" to the nucleotide sequence which encodes the
polypeptide.

An operable linkage is a linkage in which a sequence is connected to
a regulatory sequence (or sequences) in such a way as to place expression
of the sequence under the influence or control of the regulatory sequence.
Two DNA sequences (such as a heparinase-encoding sequence and a
promoter region sequence linked to the 5' end of the encoding sequence)
are said to be operably linked if induction of promoter function results in
the transcription of the heparinase-encoding sequence mRNA and if the
nature of the linkage between the two DNA sequences does not (1) result
in the introduction of a frame-shift mutation, (2) interfere with the ability
of the expression regulatory sequences to direct the expression of the
heparinases, or (3) interfere with the ability of the heparinase template to
be transcribed by the promoter region sequence. Thus, a promoter region
would be operably linked to a DNA sequence if the promoter were capable
of effecting transcription of that DNA sequence.

The precise nature of the regulatory regions needed for gene
expression may vary between species or cell types, but in general includes,
as necessary, 5' non-transcribing and 5' non-translating (non-coding)
sequences involved with initiation of transcription and translation
respectively, such as the TATA box, capping sequence, CAAT sequence, and
the like. Especially, such 5' non-transcribing control sequences will include
a region which contains a promoter for transcriptional control of the
operably linked gene.

If desired, a fusion product of the heparinases may be constructed.
For example, the sequence coding for heparinases may be linked to a signal
sequence which will allow secretion of the protein from, or the
compartmentalization of the protein in, a particular host. Such signal
sequences may be designed with or without specific protease sites such
that the signal peptide sequence is amenable to subsequent removal.
Alternatively, the native signal sequence for this protein may be used.

Transcriptional initiation regulatory signals can be selected which 
allow for repression or activation, so that expression of the operably linked 
genes can be modulated. 

Based on this disclosure, one skilled in the art can readily place the
sequences of the present invention in additional expression vectors and
transform into a variety of bacteria to obtain recombinant heparinase II or
heparinase III.

Once the vector or DNA sequence containing the construct(s) is
prepared for expression, the DNA construct(s) is introduced into an
appropriate host cell by any of a variety of suitable means, including
transfection. After the introduction of the vector, recipient cells are grown
in a selective medium, which selects for the growth of vector-containing
cells. Expression of the cloned gene sequence(s) results in the production
of heparinase I, II or III, or in the production of a fragment of one of these
proteins. This expression can take place in a continuous manner in the
transformed cells, or in a controlled manner, for example, expression which
follows induction of differentiation of the transformed cells (for example,
by administration of bromodeoxyuracil to neuroblastoma cells or the like).

The expressed protein is isolated and purified in accordance with
conventional conditions, such as extraction, precipitation, chromatography,
electrophoresis, or the like. Detailed procedures for the isolation of the
heparinases are discussed in the examples below.

The invention further provides functional derivatives of the
sequences of heparinase II, heparinase III, and the modified ribosome
binding site. As used herein, the term "functional derivative" is used to
define any DNA sequence which is derived from the original DNA sequence
and which still possesses the biological activities of the native parent
molecule. A functional derivative can be an insertion, a deletion, or a
substitution of one or more bases in the original DNA sequence. The
substitutions can be such that they replace a native amino acid with
another amino acid that does not substantially affect the functioning of the
protein. Those skilled in the art will recognize that likely substitutions
include those which preserve the functioning of the protein, such as a small,
neutrally charged amino acid replacing another small, neutrally charged
amino acid.

Those of skill in the art will recognize that functional derivatives of the
heparinases can be prepared by mutagenesis of the DNA using one of the
procedures known in the art, such as site-directed mutagenesis. In
addition, random mutagenesis can be conducted and mutants retaining
function can be obtained through appropriate screening.
The antibodies of the present invention include monoclonal and
polyclonal antibodies, as well as fragments of these antibodies. Fragments of
the antibodies of the present invention include, but are not limited to, the
Fab, the Fab2, and the Fc fragment.

The invention also provides hybridomas which are capable of
producing the above-described antibodies. A hybridoma is an
immortalized cell line which is capable of secreting a specific monoclonal
antibody.

In general, techniques for preparing polyclonal and monoclonal
antibodies as well as hybridomas capable of producing the desired
antibody are well-known in the art (Campbell, A.M., "Monoclonal Antibody
Technology: Laboratory Techniques in Biochemistry and Molecular
Biology," Elsevier Science Publishers, Amsterdam, The Netherlands (1984);
St. Groth et al., J. Immunol. Methods 35:1-21 (1980)).

Any mammal which is known to produce antibodies can be
immunized with the pseudogene polypeptide. Methods for immunization
are well-known in the art. Such methods include subcutaneous or
intraperitoneal injection of the polypeptide. One skilled in the art will
recognize that the amount of heparinase used for immunization will vary
based on the animal which is immunized, the antigenicity of the peptide
and the site of injection.

The protein which is used as an immunogen may be modified or
administered in an adjuvant in order to increase the protein's antigenicity.
Methods of increasing the antigenicity of a protein are well-known in the
art and include, but are not limited to, coupling the antigen with a
heterologous protein (such as globulin or β-galactosidase) or through the
inclusion of an adjuvant during immunization.

For monoclonal antibodies, spleen cells from the immunized animals
are removed, fused with myeloma cells, such as SP2/0-Ag14 myeloma
cells, and allowed to become monoclonal antibody producing hybridoma
cells.

Any one of a number of methods well known in the art can be used
to identify the hybridoma cell which produces an antibody with the
desired characteristics. These include screening the hybridomas with an
ELISA assay, western blot analysis, or radioimmunoassay (Lutz et al., Exp.
Cell Res. 775:109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the class
and subclass are determined using procedures known in the art (Campbell,
A.M., Monoclonal Antibody Technology: Laboratory Techniques in
Biochemistry and Molecular Biology, Elsevier Science Publishers,
Amsterdam, The Netherlands (1984)).

For polyclonal antibodies, antibody containing antisera is isolated 
from the immunized animal and is screened for the presence of antibodies 
with the desired specificity using one of the above-described procedures. 

The present invention further provides the above-described
antibodies in detectably labelled form. Antibodies can be detectably
labelled through the use of radioisotopes, affinity labels (such as biotin,
avidin, etc.), enzymatic labels (such as horseradish peroxidase, alkaline
phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.),
paramagnetic atoms, chemiluminescent labels, and the like. Procedures for
accomplishing such labelling are well-known in the art; for example, see
Sternberger, L.A. et al., J. Histochem. Cytochem. 75:315 (1970); Byer, E.A. et
al., Meth. Enzym. 62:308 (1979); Engval, E. et al., Immunol. 109:129 (1972);
Goding, J.W., J. Immunol. Meth. 13:215 (1976).

The present invention further provides the above-described
antibodies immobilized on a solid support. Examples of such solid supports
include plastics, such as polycarbonate, complex carbohydrates such as
agarose and sepharose, acrylic resins such as polyacrylamide and latex
beads. Techniques for coupling antibodies to such solid supports are well
known in the art (Weir et al., Handbook of Experimental Immunology, 4th
Ed., Blackwell Scientific Publications, Oxford, England (1986)). The
immobilized antibodies of the present invention can be used for
immunoaffinity purification of heparinases.

Having now described the invention, the same will be
understood by a series of specific examples, which are not intended to be
limiting.

EXAMPLE 1: Purification of Heparinases
Heparin lyase enzymes were purified from cultures of
Flavobacterium heparinum. F. heparinum was cultured in a 15 L
computer-controlled fermenter using a variation of the defined nutrient
medium described by Galliher et al., Appl. Environ. Microbiol. 47: 360-365
(1981). Those fermentations designed to produce heparin lyases
incorporate semi-purified heparin (Celsus Laboratories) in the media at a
concentration of 1.0 g/L as the inducer of heparinase synthesis. Cells were
harvested by centrifugation and the desired enzymes released from the
periplasmic space by a variation of the osmotic shock procedure described
by Zimmermann and Cooney, U.S. Patent No. 5,262,325, herein
incorporated by reference.

A semi-purified preparation of the heparinase enzymes was
achieved by a modification of the procedure described by Zimmermann et
al., U.S. Patent No. 5,262,325. Proteins from the crude extract were
adsorbed onto cation exchange resin (CBX, J.T. Baker) at a conductivity of
1-7 umhos. Unbound proteins from the extract were discarded and the
resin packed into a chromatography column (5.0 cm i.d. x 100 cm). The
bound proteins were eluted at a linear flow rate of 3.75 cm·min-1 with step
gradients of 0.01 M phosphate, 0.01 M phosphate/0.1 M sodium chloride,
0.01 M phosphate/0.25 M sodium chloride and 0.01 M phosphate/1.0 M
sodium chloride, all at pH 7.0 +/- 0.1. Heparinase II elutes in the 0.1 M
NaCl fraction, while heparinases I and III elute in the 0.25 M fraction.

Alternatively, the 0.1 M sodium chloride step was eliminated and the
three heparinases co-eluted with 0.25 M sodium chloride. The heparinase
fractions were loaded directly onto a column containing cellufine sulfate
(5.0 cm i.d. x 30 cm, Amicon) and eluted at a linear flow rate of 2.50
cm·min-1 with step gradients of 0.01 M phosphate, 0.01 M phosphate/0.2
M sodium chloride, 0.01 M phosphate/0.4 M sodium chloride and 0.01 M
phosphate/1.0 M sodium chloride, all at pH 7.0 +/- 0.1. Heparinases II and
III elute in the 0.2 M sodium chloride fraction while heparinase I elutes in
the 0.4 M fraction.

The 0.2 M sodium chloride fraction from the cellufine sulfate column
was diluted with 0.01 M sodium phosphate to give a conductance of less
than 5 umhos. The solution was further purified by loading the material
onto a hydroxylapatite column (2.6 cm i.d. x 20 cm) and eluting the bound
protein at a linear flow rate of 1.0 cm·min-1 with step gradients of 0.01 M
phosphate, 0.01 M phosphate/0.35 M sodium chloride, 0.01 M
phosphate/0.45 M sodium chloride, 0.01 M phosphate/0.65 M sodium
chloride and 0.01 M phosphate/1.0 M sodium chloride, all at pH 7.0 +/- 0.1.
Heparinase II elutes in a single protein peak in the 0.45 M sodium
chloride fraction while heparinase III elutes in a single protein peak in the
0.65 M sodium chloride fraction.

Heparinase I was further purified by loading material from the
cellufine sulfate column, diluted to a conductivity less than 5 umhos, onto
a hydroxylapatite column (2.6 cm i.d. x 20 cm) and eluting the bound
protein at a linear flow rate of 1.0 cm·min-1 with a linear gradient of
phosphate (0.01 to 0.25 M) and sodium chloride (0.0 to 0.5 M). Heparinase
I elutes in a single protein peak approximately mid-way through the
gradient.

The heparinase enzymes obtained by this method were analyzed by
SDS-PAGE using the technique of Laemmli, Nature 227: 680-685 (1970),
and the gels quantified by a scanning densitometer (Bio-Rad, Model
GS-670). Heparinases I, II and III displayed molecular weights of 42,500
+/- 2,000, 84,000 +/- 4,200 and 73,000 +/- 3,500 Daltons, respectively. All
proteins displayed purities of greater than 99%. Purification results for
the heparinase enzymes are shown in Table 1.

Heparinase activities were determined by the spectrophotometric 
assay described by Yang et al. A modification of this assay incorporating a 
reaction buffer comprised of 0.018 M Tris, 0.044 M sodium chloride and 
1-5 g/L heparan sulfate at pH 7.5 was used to measure heparan sulfate 
degrading activity. 

Recombinant heparinase I forms intracellular inclusion bodies which
require denaturation and protein refolding to obtain active heparinase.
Two solvents, urea and guanidine hydrochloride, were examined as
solubilizing agents. Of these, only guanidine HCl, at 6 M, was able to
solubilize the heparinase I inclusion bodies. However, the highest degree
of purification was obtained by sequentially washing the inclusion bodies
in 3 M urea and 6 M guanidine HCl. The urea wash step served to remove
contaminating E. coli proteins and cell debris prior to solubilization of the
aggregated heparinase I by guanidine HCl.

Recombinant heparinase I was prepared by growing E. coli
Y1090(pGHep1), a strain harboring a plasmid containing the heparinase I
gene expressed from tandem tac promoters, in Luria broth with 0.1 M
IPTG. The cells were concentrated by centrifugation and resuspended in
1/10th volume buffer containing 0.01 M sodium phosphate and 0.2 M
sodium chloride at pH 7.0. The cells were disrupted by sonication, 5
minutes with intermittent 30 second cycles, power setting #3, and the
inclusion bodies concentrated by centrifugation, 7,000 x g, 5 minutes. The
pellets were washed two times with cold 3 M urea for 2 hours at pH 7.0
and the insoluble material recovered by centrifugation. Heparinase I was
unfolded in 6 M guanidine HCl containing 55 mM DTT and refolded by
dialysis into 0.1 M ammonium sulfate. Additional contaminating proteins
precipitated in the 0.1 M ammonium sulfate and could be removed by
centrifugation. Heparinase I purified by this method had a specific activity
of 42.21 IU/mg and was 90% pure by SDS-PAGE/scanning densitometry
analysis. The enzyme can be further purified by cation exchange
chromatography, as described above, yielding a heparinase I preparation
that is more than 99% pure by SDS-PAGE/scanning densitometry analysis.

EXAMPLE 2: Characterization of Heparinases 
The molecular weight and kinetic properties of the three heparinase 
enzymes have been accurately reported by Lohse and Linhardt, J. Biol. 
Chem. 267:24347-24355 (1992). However, an accurate characterization of 
the proteins' post-translational modifications had not been carried out. 
Heparinases I, II and III, purified as described herein, were analyzed for
the presence of carbohydrate moieties. Solutions containing 2 µg of
heparinases I, II and III and recombinant heparinase I were brought to pH
5.7 by adding 0.2 M sodium acetate. These protein samples underwent
carbohydrate biotinylation following protocol 2a, described in the
GlycoTrack kit (Oxford Glycosystems). 30 µl of each biotinylated protein
solution was subjected to SDS-PAGE (10% gel) and transferred by
electroblotting at 170 mA constant current to a nitrocellulose membrane.
Detection of the biotinylated carbohydrate was accomplished by an
alkaline phosphatase-specific color reaction after attachment of a
streptavidin-alkaline phosphatase conjugate to the biotin groups. These
analyses revealed that heparinases I and II are glycosylated and
heparinase III and recombinant heparinase I are not.

Polyclonal antibodies generated in rabbits injected with wild type 
heparinase I could be fractionated into two populations as described 
below. It appears that one of these fractions recognizes a post- 
translational moiety common to proteins made in F. heparinum, while the 
other fraction specifically recognizes amino acid sequences contained in 
heparinase I. All heparinase enzymes made in F. heparinum were 
recognized by the "non-specific" antibodies, whereas heparinase made in E.
coli was not. The most likely candidate for the non-protein antigenic determinant
from heparinase I is the carbohydrate component; thus, the Western blot 
experiment indicates that all lyases made in F. heparinum are glycosylated. 

Purified heparinases II and III were analyzed by the technique of
Edman to determine the N-terminal amino acid residue of the mature
protein. However, the Edman chemistry was unable to liberate an amino
acid, indicating that a post-translational modification had occurred at the
N-terminal amino acid of both heparinases. One nmol samples of
heparinases II and III were used for deblocking with pyroglutamate
aminopeptidase. Control samples were produced by mock deblocking 1
nmol protein samples without adding pyroglutamate aminopeptidase. All
samples were placed in 10 mM NH4CO3, pH 7.5, and 10 mM DTT (100 µl
final volume). To non-control samples, 1 mU of pyroglutamate
aminopeptidase was added and all samples were incubated for 8 hr at
37°C. After incubation, an additional 0.5 mU of pyroglutamate
aminopeptidase was added to non-control samples and all samples were
incubated for an additional 16 hr at 37°C.

Deblocking buffers were exchanged for 35% formic acid using a
10,000 Dalton cut-off Centricon unit and the sample was dried under
vacuum. The samples were subjected to amino acid sequence analysis
according to the method of Edman.

The properties of the three heparinase proteins from Flavobacterium
heparinum are listed in Table 2.

Heparinases II and III were digested with cyanogen bromide in
order to produce peptide fragments for isolation. The protein solutions (1-
10 mg/ml protein concentration) were brought to a DTT concentration of
0.1 M, and incubated at 40°C for 2 hr. The samples were frozen and
lyophilized under vacuum. The pellet was resuspended in 70% formic acid,
and nitrogen gas was bubbled through the solution to exclude oxygen. A
stock solution of CNBr was made in 70% formic acid and the stock solution
was bubbled with nitrogen gas and stored in the dark for short time
periods. For addition of CNBr, a 500 to 1000 times molar excess of CNBr to
methionine residues in the protein was used. The CNBr stock was added to
the protein solutions, bubbled with nitrogen gas and the tube was sealed.
The reaction tube was incubated at 24°C for 20 hr, in the dark.

The samples were dried down partially under vacuum, water was
added to the sample, and partial lyophilization was repeated. This washing
procedure was repeated until the sample pellets were white. The peptide
mixtures were solubilized in formic acid and applied to a Vydac C18
reverse phase HPLC column (4.6 mm i.d. x 30 cm) and individual peptide
fragments eluted at a linear flow rate of 6.0 cm·min-1 with a linear
gradient of 10 to 90% acetonitrile in 1% trifluoroacetic acid. Fragments
recovered from these reactions were subjected to amino acid sequence
determination using an Applied Biosystems 745A Protein Sequencer. Three
peptides isolated from heparinase II gave sequences: EFPEMYNLAAGR
(SEQU ID NO:5), KPADIPEVKDGR (SEQU ID NO:6), and
LAGDFVTGKILAQGFGPDNQTPDYTYL (SEQU ID NO:7) and were named peptides 2A,
2B and 2C, respectively. Three peptides from heparinase III gave sequences:
LIKNEVRWQLHRVK (SEQU ID NO:8), VLKASPPGEFHAQPDNGTFELFI (SEQU ID
NO:9) and KALVHWFWPHKGYGYFDYGKDIN (SEQU ID NO:10) and were
named peptides 3A, 3B and 3C, respectively.
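
Cyanogen bromide cleaves a polypeptide on the C-terminal side of
methionine residues, so the fragments expected from a complete digest can
be predicted from a protein sequence. The following is a minimal sketch of
that rule only. The function name and the input sequence are hypothetical
(the example is not heparinase sequence), and incomplete cleavage, which
can occur in practice, is not modelled.

```python
def cnbr_fragments(protein: str) -> list[str]:
    """Split a one-letter protein sequence after every methionine (M).

    Models an idealized, complete CNBr digest: each fragment except
    possibly the last ends in M.
    """
    fragments = []
    current = ""
    for residue in protein.upper():
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        # Trailing residues after the last methionine form the final fragment.
        fragments.append(current)
    return fragments


# Hypothetical sequence, for illustration only.
example = "GSAMKPLDMTTRW"
print(cnbr_fragments(example))
```

A sequence containing n internal methionines yields n + 1 fragments under
this idealization, which gives a quick check on the number of HPLC peaks
to expect.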

EXAMPLE 3: Antibodies to the Heparinase Proteins 
Heparinases I, II and III and recombinant heparinase I, purified as 
described herein, were used to generate polyclonal antibodies in rabbits. 
Each of heparinase I, II and III was carried through the following standard 
immunization procedure: The primary injection consisted of 0.5 - 1.0 mg of
purified protein dissolved in 1 ml of sterile phosphate buffered saline,
which was homogenized with 1 ml of Freund's adjuvant (Cedarlane
Laboratories Ltd.). This protein-adjuvant emulsion was used to inject New
Zealand White female rabbits; 1 ml per rabbit, 0.5 ml per rear leg, i.m., in
the thigh muscle near the hip. After 2 to 3 weeks, the rabbits were given
an injection boost consisting of 0.5 - 1.0 mg of purified protein dissolved in
sterile phosphate buffered saline homogenized with 1 ml of incomplete
Freund's adjuvant (Cedarlane Laboratories, Ltd.). Again after 2 to 3 weeks,
the rabbits were given a third identical injection boost.

A blood sample was collected from each animal from the central artery
of the ear approximately 10 days following the final injection boost.
Serum was prepared by allowing the sample to clot for 2 hours at 22°C
followed by overnight incubation at 4°C and clearing by centrifugation at
5,000 rpm for 10 min. The antisera were diluted 1:100,000 in Tris-
buffered saline (pH 7.5) and carried through Western blot analysis to
identify those sera containing anti-heparinase I, II or III antibodies.

Antibodies generated against wild type heparinase I, but not
recombinant heparinase I, displayed a high degree of cross reactivity
against other F. heparinum proteins. This was likely due to the presence
of an antigenic post-translational modification common to F. heparinum
proteins but not found on proteins synthesized in E. coli. To explore this
further, recombinant heparinase I was immobilized onto Sepharose beads
and packed into a chromatography column. Purified anti-heparinase I
(wild type) antibodies were loaded onto the column and the unbound
fraction collected. Bound antibodies were eluted in 0.1 M glycine, pH 2.0.
IgG was found in both the unbound and bound fractions and subsequently
used in Western blot experiments. Antibody isolated from the unbound
fraction non-specifically recognized F. heparinum proteins but no longer
detected recombinant heparinase I (E. coli), while the antibody isolated
from the bound fraction only recognized heparinase I, whether synthesized
in F. heparinum or E. coli. This result indicated that, as hypothesized, two
populations of antibodies are formed by exposure to the wild-type
heparinase I antigen: one specific for the protein backbone and the other
recognizing a post-translationally modified moiety common to F.
heparinum proteins.

This finding provides both a means to purify specific anti-heparinase 
antibodies and a tool for characterizing the wild-type heparinase I protein. 

EXAMPLE 4: Construction of a F. heparinum Gene Library 
A Flavobacterium heparinum chromosomal DNA library was
constructed in lambda phage DASHII. 0.4 µg of F. heparinum chromosomal
DNA was partially digested with restriction enzyme Sau3A to produce a
majority of fragments around 20 kb in size, as described in Maniatis et al.,
Molecular Cloning Manual, Cold Spring Harbor (1982). This DNA was
phenol/chloroform extracted, ethanol precipitated, ligated with λDASHII
arms and packaged with packaging extracts from a λDASHII/BamHI
Cloning Kit (Stratagene, La Jolla, CA). The library was titered at
approximately 10^5 pfu/ml after packaging, amplified to 10^8 pfu/ml by
the plate lysis method, and stored at -70°C as described by Silhavy, T.J., et
al. in Experiments in Molecular Genetics, Cold Spring Harbor Laboratory,
1992.

The F. heparinum chromosomal library was titered to about 300
pfu/plate, overlaid on a lawn of E. coli and allowed to transfect the cells
overnight at 37°C, forming plaques. The phage plaques were transferred
to nitrocellulose paper, and the phage DNA bound to the filters, as
described in Maniatis et al., ibid.

EXAMPLE 5: A Modified Ribosome Binding Region for the Expression of
Flavobacterium heparinum Glycosaminoglycan Lyases
The gene for the mature heparinase I protein was cloned into the
EcoRI site of the vector, pB9, where its expression was driven by two
repeats of the tac promoter (from expression vector, pKK223-3, Brosius
and Holy, Proc. Natl. Acad. Sci. USA 81: 6929-6933 (1984)). In this vector,
pBhep, the first codon, ATG, for heparinase I is separated by 10
nucleotides from a minimal Shine-Dalgarno sequence AGGA (Shine and
Dalgarno, Proc. Natl. Acad. Sci. USA 71:1342-1346 (1974)), Figure 1. This
construct was transformed into the E. coli strain, JM109, grown at 37°C
and induced with 1 mM IPTG, 2 hours before harvesting. Cells were lysed
by sonication, the cell membrane fraction was pelleted and the
supernatant was saved. The membrane fraction was resuspended in 6 M
guanidine-HCl in order to solubilize inclusion bodies containing the
recombinant heparinase I enzyme. The soluble heparinase I was refolded
by diluting in 20 mM phosphate buffer. The enzyme activity was
determined in the refolded pellet fraction, and in the supernatant fraction.
Low levels of activity were detected in the supernatant and the pellet
fractions. Analysis of the fractions by SDS-PAGE indicated that both
fractions may contain minor bands corresponding to the recombinant
heparinase I.

In an attempt to increase expression levels from pBhep, two
mutations were introduced as indicated in Figure 1. The mutations were
produced to improve the level of translation of the heparinase I mRNA by
increasing the length of the Shine-Dalgarno sequence and by decreasing
the distance between the Shine-Dalgarno sequence and the ATG-start site.
Using PCR, a single base mutation converting an A to a G improved the
Shine-Dalgarno sequence from a minimal AGGA sequence to AGGAG while
decreasing the distance between the Shine-Dalgarno sequence and the
translation start site from 10 to 9 base pairs. This construct was named
pGhep. In the second construct, pΔ4hep, 4 nucleotides (AACA) were
deleted using PCR, in order to lengthen the Shine-Dalgarno sequence to
AGGAG as well as moving it to within 5 base pairs of the ATG-start site.
The different constructs were analyzed as described above. Refolded
pellets from E. coli transformed with pGhep displayed approximately a
7-fold increase in heparinase I activity, as compared to refolded pellets
from E. coli containing pBhep. On the other hand, E. coli containing
pΔ4hep displayed 2-3 times less activity than the pBhep-containing E.
coli. The levels of heparinase I activity in the supernatants were similar.
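
The effect of the two mutations on the Shine-Dalgarno sequence and its
spacing from the start codon can be checked with a short calculation. The
sketch below is illustrative only: the 17-nucleotide sequence is an
assumption chosen to be consistent with the description above (AGGA
followed by 10 nucleotides and ATG, with AACA immediately after AGGA); it
is not taken from Figure 1, and the function and variable names are
hypothetical.

```python
def spacer_length(seq: str, sd: str, start: str = "ATG") -> int:
    """Number of nucleotides between the end of the Shine-Dalgarno
    motif `sd` and the first `start` codon found downstream of it."""
    sd_pos = seq.find(sd)
    if sd_pos < 0:
        raise ValueError(f"Shine-Dalgarno motif {sd!r} not found")
    sd_end = sd_pos + len(sd)
    start_pos = seq.find(start, sd_end)
    if start_pos < 0:
        raise ValueError(f"start codon {start!r} not found after motif")
    return start_pos - sd_end


# Assumed parental region: AGGA + 10 nt spacer + ATG (illustration only).
pbhep = "AGGAAACAGAATTCATG"

# Single A -> G change at the base immediately following AGGA.
pghep = pbhep[:4] + "G" + pbhep[5:]

# Deletion of the four nucleotides AACA that follow AGGA.
p_delta4_hep = pbhep.replace("AACA", "", 1)

for name, seq, sd in [
    ("pBhep", pbhep, "AGGA"),
    ("pGhep", pghep, "AGGAG"),
    ("p-delta-4-hep", p_delta4_hep, "AGGAG"),
]:
    print(name, seq, sd, spacer_length(seq, sd))
```

Under these assumptions both mutations extend the motif from AGGA to
AGGAG, while the spacer shortens from 10 nucleotides to 9 for the point
mutation and to 5 for the deletion, matching the distances stated in the
text.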

Plasmid, pBhep, was digested with EcoRI and treated with S1
nuclease to form blunt-ended DNA. The plasmid DNA was then digested
with BamHI and the single-stranded ends were made double-stranded by
filling-in with Klenow fragment. The blunt-end DNA was ligated and
transformed into E. coli strain FTB1. A plasmid which contained a unique
BamHI site and no heparinase I gene DNA was purified from a kanamycin
resistant colony and was designated plasmid pGB. DNA sequence analysis
revealed that plasmid pGB contained the modified ribosome binding site,
shown in Figure 1.

EXAMPLE 6: Nucleic Acid Encoding Heparinase II 
Four "guessmer" oligonucleotides were designed using information
from two peptide sequences 2A and 2B and use of the consensus codons
for Flavobacterium, shown in Table 3. These were:

5'-GAATTCCCTGAGATGTACAATCTGGCCG-3' (SEQU ID NO:11),
5'-CCGGGAGCCAGATTGTACATCTCAGG-3' (SEQU ID NO:12),
5'-AAACCCGCCGACATTCCCGAAGTAAAAGA-3' (SEQU ID NO:13), and
5'-CGAAAGTCTTTTACTTCGGGAATGTCGGC-3' (SEQU ID NO:14),
named 2-1, 2-2, 2-3 and 2-4, respectively. The oligonucleotides were
synthesized with a Bio/CAN (Mississauga, Ontario) peptide synthesizer.
Pairs of these oligonucleotides were used as primers in PCR reactions. F.
heparinum chromosomal DNA was digested with restriction endonucleases
SalI, XbaI or NotI, and the fragmented DNA combined for use as the
template DNA. Polymerase chain reaction mixtures were produced using
the DNA Amplification Reagent Kit (Perkin Elmer Cetus, Norwalk, CT). The
PCR amplifications were carried out in 100 µl reaction volume containing
50 mM KCl, 10 mM Tris HCl, pH 9, 0.1% Triton X-100, 1.5 mM MgCl2, 0.2
mM of each of the four deoxyribose nucleotide triphosphates (dNTPs), 100
pmol of each primer, 10 ng of fragmented F. heparinum genomic DNA and
2.5 units of Taq polymerase (Bio/CAN Scientific Inc., Mississauga, Ontario).
The samples were placed on an automated heating block (DNA thermal
cycler, Barnstead/Thermolyne Corporation, Dubuque, IA) programmed for
step cycles of: denaturation temperature 92°C (1 minute), annealing
temperatures of 37°C, 42°C or 45°C (1 minute) and extension temperature
72°C (2 minutes). These cycles were repeated 35 times. The resulting PCR
products were analyzed on a 1.0% agarose gel containing 0.6 µg/ml
ethidium bromide, as described by Maniatis et al., ibid. DNA fragments
were produced by oligonucleotides 2-2 and 2-3. The fragments, 250 bp
and 350 bp in size, were first separated by 1% agarose gel electrophoresis,
and the DNA extracted using the GENECLEAN I kit (Bio/CAN Scientific,
Mississauga, Ontario). Purified fragments were ligated into pTZ/PC (Tessier
and Thomas, unpublished) previously digested with NotI, Figure 2, and the
ligation mixture used to transform E. coli FTB1, as described in Maniatis et
al., ibid. All restriction enzymes and T4 DNA ligase were purchased from
New England Biolabs (Mississauga, Ontario).
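
The relationship between a "guessmer" and the peptide it was designed
from can be verified by translating the oligonucleotide. The sketch below
translates oligonucleotide 2-3 as printed above and compares the result
with peptide 2B (KPADIPEVKDGR). It assumes the standard genetic code; the
function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order for each position.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate complete codons in frame 1; trailing 1-2 bases are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


oligo_2_3 = "AAACCCGCCGACATTCCCGAAGTAAAAGA"  # SEQU ID NO:13, as printed
peptide_2b = "KPADIPEVKDGR"                  # SEQU ID NO:6, as printed

encoded = translate(oligo_2_3)
print(encoded, peptide_2b.startswith(encoded))
```

The 29-base oligonucleotide carries nine complete codons, and these
correspond to the first nine residues of peptide 2B.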

Strain FTB1 was constructed in our laboratory. The F episome from
the XL-1 Blue E. coli strain (Stratagene, La Jolla, CA), which carries the lac
Iq repressor gene and produces 10 times more lac repressor than wild
type E. coli, was moved, as described by J. Miller, Experiments in Molecular
Genetics, Cold Spring Harbor Laboratory (1972), into the TB1 E. coli strain,
described by Baker, T.A., et al., Proc. Natl. Acad. Sci. 81:6779-6783 (1984).
The FTB1 background permits a more stringent repression of transcription
from plasmids carrying promoters with a lac operator (i.e. lac and tac
promoters). Colonies resulting from the transformation of FTB1 were
selected on LB agar containing ampicillin and screened using the
blue/white screen provided by X-gal and IPTG included in the agar
medium, as described by Maniatis et al., ibid. Transformants were
analyzed by colony cracking and mini-preparations of DNA were made for
enzyme restriction analysis using the RPM kit (Bio/CAN Scientific Inc.,
Mississauga, Ontario). Ten plasmids contained inserts of the correct size,
which were released upon digestion with EcoRI and HindIII.

DNA sequencing revealed that one of the plasmids, pCE14, contained
a 350 bp PCR fragment that had the expected DNA sequence as derived from
peptide 2C. DNA sequences were determined by the dideoxy-chain
termination method of Sanger et al., Proc. Natl. Acad. Sci. 74:5463-5467
(1978). Sequencing reactions were carried out with the Sequenase Kit (U.S.
Biochemical Corp., Cleveland, Ohio) and 35S-dATP (Amersham Canada Ltd.,
Oakville, Ontario, Canada), as specified by the supplier.

The heparinase II gene was cloned from a F. heparinum chromosomal
DNA library, Figure 2, constructed as described above. Ten
plaque-containing filters were hybridized with the DNA probe, produced from
the gel purified insert of pCE14, which was labeled using a Random Labeling
Kit (Boehringer Mannheim Canada, Laval, Quebec). Plaque hybridization
was carried out, as described in Maniatis et al., ibid., at 65°C for 16 hours
in a Tek Star hybridization oven (Bio/CAN Scientific, Mississauga, Ontario).
Subsequent washes were performed at 65°C: twice for 15 min. in 2X SSC,
once in 2X SSC/0.1% SDS for 30 min. and once in 0.5X SSC/0.1% SDS for 15
min. Positive plaques were harvested using plastic micropipette tips and
confirmed by dot blot analysis, as described by Maniatis et al., ibid. Six of
the phages, which gave strong hybridization signals, were used for
Southern hybridization analysis, as described by Southern, E.M., J. Mol. Biol.
98:503-517 (1975). This analysis showed that one phage, HIIS, contained
a 5.5 kb XbaI DNA fragment which hybridized with the probe. Cloning the
5.5 kb XbaI fragment into the XbaI site of any of the following vectors:
pTZ/PC, pBluescript (Stratagene, La Jolla, CA), pUC18 (described in
Yanisch-Perron et al., Gene 33:103-119 (1985)), and pOK12 (described in
Vieira and Messing, Gene 100:189-194 (1991)), was unsuccessful, even
though the FTB1 background was used to repress plasmid promoter-derived
transcription. Vector pOK12, a low copy number plasmid derived from
pACYC184 (approximately 10 copies/cell, Chang, A.C.Y. and Cohen, S.N., J.
Bact. 134:1141-1156 (1978)), was used in an attempt to circumvent the
toxic effects of a foreign DNA fragment in E. coli by minimizing the number
of copies of the toxic foreign fragment. In addition, insertion of the entire
NotI chromosomal DNA insert of the HIIS phage into plasmid pOK12
was unsuccessful. It was concluded that this region of the F.
heparinum chromosome imparts a negative-selective effect on any E. coli
cells that harbor it. This toxic effect had not been observed previously
with other F. heparinum chromosomal DNA fragments.

A second strategy employed to circumvent the unexpected problem
of F. heparinum DNA toxicity in E. coli was to digest the chromosomal DNA
fragment with a restriction endonuclease which would divide the
fragment, and if possible the heparinase II gene, into two pieces, Figure 2.
These fragments could be cloned individually. DNA sequence analysis of
the PCR insert in plasmid pCE14 demonstrated that BamHI and EcoRI sites
were present in the insert. Hybridization experiments also demonstrated
that the BamHI digested F. heparinum DNA in phage HIIS produced two
bands 1.8 and 5.5 kb in size. Analysis of hybridization data indicated that
the 1.8 kb band contains the 5' end and the 5.5 kb band contains the 3'
end of the gene. Furthermore, a 5 kb EcoRI F. heparinum chromosomal
DNA fragment hybridized with the PCR probe. The 1.8, 5, and 5.5 kb
fragments containing heparinase II gene sequences were inserted into
pBluescript, as described above. Two clones, pBSIB6-7 and pBSIB6-21,
containing the 5.5 kb BamHI insert in different orientations were isolated
and one plasmid, pBSIB2-13, was isolated which contained the 1.8 kb
BamHI fragment. No clones containing the 5 kb EcoRI fragment were
isolated, even though extensive screening of possible clones was done.

The molecular weight of heparinase II protein is approximately 84
kD, so the size of the corresponding gene would be approximately 2.4 kb.
The 1.8 and 5.5 kb BamHI chromosomal DNA fragments could include the
entire heparinase II gene. The plasmids pBSIB6-7, pBSIB6-21 and
pBSIB2-13, Figure 2, were used to produce nested deletions with the
Erase-a-Base system (Promega Biotec, Madison, Wis.). These plasmids were
used as templates for DNA sequence analysis using universal and reverse
primers and oligonucleotide primers derived from known heparinase II
sequence. Because parts of the gene were relatively G-C rich and
contained numerous strong secondary structures, the sequence analysis
was, at times, performed using reactions in which the dGTP was replaced
by dITP. Analysis of the DNA sequence, Figure 4, indicated that there was
a single, continuous open reading frame containing codons for 772 amino
acid residues, Figure 5. Searching for a possible signal peptide sequence
using Geneworks (Intelligenetics, Mountain View, CA) suggested that there
are two possible sites for processing of the protein into a mature form:
Q-26 (glutamine) and D-30 (aspartate). N-terminal amino acid sequencing of
deblocked, processed heparinase II indicated that the mature protein
begins with Q-26, and contains 747 amino acids with a calculated
molecular weight of 84,545 Daltons, Figure 5.
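
The size estimates in this paragraph follow from simple arithmetic, which
the sketch below reproduces. The average residue mass of about 110 Daltons
is a common rule of thumb and is an assumption here, not a value from this
document; the function and variable names are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb assumption, not from the source


def estimated_gene_length_bp(protein_mass_da: float,
                             avg_residue_mass_da: float = AVG_RESIDUE_MASS_DA) -> int:
    """Estimate coding length: residues = mass / average mass, 3 bp per residue."""
    residues = protein_mass_da / avg_residue_mass_da
    return round(residues * 3)


# Estimate from the SDS-PAGE mass of about 84 kD (on the order of 2.3-2.4 kb).
print(estimated_gene_length_bp(84_000))

# Values reported for the sequenced open reading frame.
orf_codons = 772
residues_before_q26 = 25          # residues 1-25 precede Q-26
mature_residues = orf_codons - residues_before_q26
orf_length_bp = orf_codons * 3    # coding bases, stop codon not counted

print(mature_residues, orf_length_bp)
print(round(84_545 / mature_residues, 1))  # mean residue mass of mature protein
```

The mature length obtained by removing the 25 residues ahead of Q-26 from
the 772-codon reading frame agrees with the 747 amino acids stated above.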

EXAMPLE 7: Expression of Heparinase II in E. coli
The vector, pGB, was used for heparinase II expression in E. coli,
Figure 3. pGB contains the modified ribosome binding region from pGhep,
Figure 1, and a unique BamHI site, whereby expression of a DNA fragment
inserted into this site is driven by a double tac promoter. The vector also
includes a kanamycin resistance gene, and the lac Iq gene to allow
induction of transcription with IPTG. Initially, a gel purified 5.5 kb BamHI
fragment from pBSIB6-21 was ligated with BamHI digested pGB and
transformed into FTB1, which was selected on LB agar with kanamycin.
Six of the resulting colonies contained plasmids with inserts in the correct
orientation for expression of the open reading frame. PstI digestion and
religation of one of these plasmids, forming pGBIID, deleted 3.5 kb of the
5.5 kb BamHI fragment and removed a BamHI site leaving only one BamHI
site directly after the Shine-Dalgarno sequence. Finally, two synthetic
oligonucleotides were designed:
5'-TGAGGATTCATGCAAACCAAGGCCGATGTGGTTTGGAA-3' (SEQU ID NO:15), and
5'-GGAGGATAACCACATTCGAGCATT-3' (SEQU ID NO:16),
for use in a PCR to produce a fragment containing a BamHI
site and an ATG start codon upstream of the mature protein encoding
sequence and a downstream BamHI site, Figure 3. Lambda clone HII-I,
isolated at the same time as lambda clone HIIS, was used as template DNA.

Cloning the blunt-end PCR product into pTZ/PC was unsuccessful,
using FTB1 as the host. Cloning the BamHI digested PCR product into the
BamHI site of pBluescript, again using FTB1 as the host, resulted in the
isolation of 2 plasmids containing the PCR fragment, after screening of 150
possible clones. One of these, pBSQTK-9, which was sequenced with
reverse and universal primers, contained an accurate reproduction of the
DNA sequence from the heparinase II gene. The BamHI digested PCR
fragment from pBSQTK-9 was inserted into the BamHI site of pGBIID in
such orientation that the ATG site was downstream of the Shine-Dalgarno
sequence. This construct, pGBH2, placed the mature heparinase II gene
under control of the tac promoters in pGB, Figure 3. Strain E. coli
FTB1(pGBH2) was grown in LB medium containing 50 µg/ml kanamycin at
37°C for 3 h. Induction of the tac promoter was achieved by adding 1
mmol IPTG and the culture placed at either room temperature or 30°C.
Heparin and heparan sulfate degrading activity was measured in the
cultures after growth for 4 hours using the method described by Yang et
al., ibid. Heparin degrading activities of 0.36 and 0.24 IU/mg protein and
heparan sulfate degrading activities of 0.49 and 0.44 IU/mg protein were
observed at room temperature and 30°C, respectively.

EXAMPLE 8: Nucleic Acid Encoding Heparinase III 
The amino acid sequence information obtained from peptides 
derived from heparinase III, Figure 9, purified as described herein,
reverse translated into highly degenerate oligonucleotides. Therefore, a 
cloning strategy relying on the polymerase chain reaction amplification of 
a section of the heparinase III gene, using oligonucleotides synthesized on 
the basis of amino acid sequence information, required eliminating some of 
the DNA sequence possibilities. An assumed codon usage was calculated 
based on known DNA sequences for genes from other Flavobacterium 
species. Sequences for 17 genes were analyzed and a codon usage table 
was compiled, Table 3. 
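
The degree of degeneracy that motivates the codon usage table can be
quantified by counting how many DNA sequences encode a given stretch of
peptide. The sketch below does this for PPGEFHA, a segment of peptide 3B,
assuming the standard genetic code; the function name is hypothetical and
the calculation is an illustration rather than part of the described
method.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order for each position.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}

# Number of synonymous codons available for each amino acid.
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def coding_degeneracy(peptide: str) -> int:
    """Count the distinct DNA sequences that encode `peptide`."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())


segment = "PPGEFHA"  # residues 6-12 of peptide 3B as printed
print(segment, coding_degeneracy(segment))
```

Fixing each codon to a single preferred choice, as a codon usage table
allows, reduces this number to one sequence per peptide segment, at the
cost of possible mismatches with the true gene sequence.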

Four oligonucleotides were designed by choosing each codon
according to the codon usage table. These were:
5'-GAATTCCATCAGTTTCAGCCGCATAAA-3' (SEQU ID NO:17),
5'-GAATTCTTTATGCGGCTGAAACTGATG-3' (SEQU ID NO:18),
5'-GAATTCCCGCCGGGCGAATTTCATGC-3' (SEQU ID NO:19)
and 5'-GAATTCGCATGAAATTCGCCCGGCGG-3' (SEQU ID NO:20), and were
named oligonucleotides 3-1, 3-2, 3-3 and 3-4, respectively. These
oligonucleotides were used in all possible combinations, in an attempt to
amplify a portion of the heparinase III gene using the polymerase chain
reaction. The PCR amplifications were carried out as described above.
Cycles of: denaturation temperature 92°C (1 minute), annealing
temperatures ranging from 37° to 55°C (1 minute) and extension
temperature 72°C (2 minutes) were repeated 35 times. Analysis of the
PCR reactions as described above demonstrated that no DNA fragments
were produced by these experiments.
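
The oligonucleotides in this first set were made as sense/antisense pairs
for the same peptide site, each carrying a 5' GAATTC (EcoRI) extension.
This can be confirmed by a reverse-complement check, sketched below for
oligonucleotides 3-3 and 3-4 as printed above. The helper names are
hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


ECORI_SITE = "GAATTC"
oligo_3_3 = "GAATTCCCGCCGGGCGAATTTCATGC"  # SEQU ID NO:19, as printed
oligo_3_4 = "GAATTCGCATGAAATTCGCCCGGCGG"  # SEQU ID NO:20, as printed

# Remove the 5' EcoRI extension to compare the gene-specific portions.
core_3_3 = oligo_3_3[len(ECORI_SITE):]
core_3_4 = oligo_3_4[len(ECORI_SITE):]

print(core_3_3)
print(reverse_complement(core_3_4))
print(reverse_complement(core_3_3) == core_3_4)
```

The gene-specific portion of 3-4 is the exact reverse complement of that
of 3-3, so the two oligonucleotides target opposite strands of the same
20-base site.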

A second set of oligonucleotides was synthesized and was comprised
of 32 base sequences, in which the codon usage table was used to guess the
third position of only half of the codons. The nucleotides within the
parentheses indicate degeneracies of two or four bases at a single site.
These were:

5'-GG(ACGT)GAATTTCATGCCCAGCC(ACGT)GA(CT)AATGG(ACGT)AC-3' (SEQU ID NO:21),
5'-GT(ACGT)CCATT(AG)TC(ACGT)GGCTGGGCATGAAATTC(ACGT)CC-3' (SEQU ID NO:22),
5'-GT(ACGT)CATCAGTT(CT)CAGCC(ACGT)CATAAAGG(ACGT)TATGG-3' (SEQU ID NO:23), and
5'-CCATA(ACGT)CCTTTATG(ACGT)GGCTG(AG)AACTGATG(ACGT)AC-3'
(SEQU ID NO:24), and were named oligonucleotides 3-5, 3-6, 3-7 and 3-8,
respectively. These oligonucleotides were used in an attempt to amplify a
portion of the heparinase III gene using the polymerase chain reaction,
and the combination of 3-6 and 3-7 gave rise to a specific 983 bp PCR
product. An attempt was made to clone this fragment by blunt end
ligation into E. coli vector, pBluescript, as well as two specifically designed
vectors for the cloning of PCR products, pTZ/PC and pCRII from the TA
Cloning™ kit (Invitrogen Corporation, San Diego, CA). All of these
constructs were transformed into the FTB1 E. coli strain. Transformants
were first analyzed by colony cracking, and subsequently mini-preparations
of DNA were made for enzyme restriction analysis. No clones
containing this PCR fragment were isolated.
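
The parenthesis notation used for the degenerate oligonucleotides can be
expanded mechanically to see how many distinct sequences each synthesis
contains. The sketch below parses oligonucleotide 3-6 (SEQU ID NO:22); the
sequence is transcribed from the listing above with its first base read as
G, which is an assumption where the source is unclear. The function names
are hypothetical.

```python
import re
from itertools import product
from math import prod


def parse_sites(oligo: str) -> list[str]:
    """Split an oligo written like 'GT(ACGT)CC' into per-position choices,
    e.g. ['G', 'T', 'ACGT', 'C', 'C']."""
    sites = []
    for group, single in re.findall(r"\(([ACGT]+)\)|([ACGT])", oligo.upper()):
        sites.append(group or single)
    return sites


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences represented by the degenerate oligo."""
    return prod(len(site) for site in parse_sites(oligo))


def expand(oligo: str) -> list[str]:
    """Enumerate every concrete sequence in the degenerate mixture."""
    return ["".join(choice) for choice in product(*parse_sites(oligo))]


# Oligonucleotide 3-6 as transcribed (leading G is an assumption).
oligo_3_6 = "GT(ACGT)CCATT(AG)TC(ACGT)GGCTGGGCATGAAATTC(ACGT)CC"

print(len(parse_sites(oligo_3_6)), degeneracy(oligo_3_6))
print(expand(oligo_3_6)[:3])
```

Three four-fold sites and one two-fold site give a mixture far smaller
than a fully degenerate design over the same eleven codons would require,
which is the practical benefit of guessing half of the third positions.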
A third set of oligonucleotides was synthesized incorporating BamHI
endonuclease sequences on the ends of the 3-6 and 3-7 oligonucleotide
sequences. A 999 base pair DNA sequence was obtained using the
polymerase chain reaction with F. heparinum chromosomal DNA as the
target. Attempts were made to clone the amplified DNA into the BamHI
site of the high copy number plasmid pBluescript and the low copy
number plasmids pBR322 and pACYC184. All of these constructs were
again transformed into the FTB1 E. coli strain. More than 500 candidates
were screened, yet no transformants containing a plasmid harboring the F.
heparinum DNA were obtained. Once again, it was concluded that this
region of the F. heparinum chromosome imparts a negative-selective effect
on E. coli cells that harbor it.
As in the case of the isolation of the heparinase II gene, the PCR
fragment was split in order to avoid the problem of foreign DNA toxicity.
Digestion of the 981 bp BamHI-digested heparinase III PCR fragment with
restriction endonuclease ClaI produced two fragments of 394 and 587 bp.
The amplified F. heparinum region was treated with ClaI and the two
fragments were separated by agarose gel electrophoresis. The 587 and 394
base pair fragments were ligated separately into plasmid pBluescript that
had been treated with restriction endonucleases BamHI and ClaI. In
addition, the entire 981 bp PCR fragment was purified and ligated into
BamHI-cut pBluescript. The ligated plasmids were inserted into the XL-1
Blue E. coli strain. Transformants containing plasmids with inserts were
selected on the basis of their ability to form white colonies on LB-agar
plates containing X-gal, IPTG and 50 µg/ml ampicillin, as described by
Maniatis. Plasmid pFB1 containing the 587 bp F. heparinum DNA fragment
and plasmid pFB2 containing the entire 981 base pair fragment were
isolated by this method. The XL-1 Blue strain, which, like strain FTB1,
contains the lacIq repressor gene on an F episome, allowed for stable
maintenance of the complete BamHI PCR fragment, unlike FTB1. The reason
for this discrepancy is not apparent from the genotypes of the two strains
(i.e., both are recA, etc.). DNA sequence analysis of the F. heparinum DNA
in plasmid pFB1 showed that it contained a sequence encoding peptide
Hep3-B, while the F. heparinum insert in plasmid pFB2 contained a DNA
sequence encoding peptides Hep3-D and Hep3-B, Figure 9. This analysis
confirmed that these inserts were part of the gene encoding heparinase III.
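
The fragment sizes reported for the ClaI digestion can be illustrated with a small in-silico digest. The sketch below is only an illustration: the 981 bp sequence is a hypothetical placeholder (the real PCR product is not reproduced here), and the helper name digest is an assumption of this example. The only facts taken as given are the ClaI recognition sequence ATCGAT, cut as AT^CGAT, and the sizes stated in the text.

```python
def digest(seq, site, cut_offset):
    """Cut a linear DNA string at every occurrence of `site`.

    `cut_offset` is the number of bases of the site that stay on the
    left-hand fragment (2 for ClaI, which cuts AT^CGAT).
    """
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


CLAI_SITE = "ATCGAT"

# Hypothetical 981 bp stand-in for the BamHI-digested PCR fragment,
# with a single ClaI site placed so that the cut falls after base 394.
toy_fragment = "G" * 392 + CLAI_SITE + "C" * 583

sizes = sorted(len(f) for f in digest(toy_fragment, CLAI_SITE, 2))
print(sizes)
```

The two sizes add up to the length of the undigested fragment, which is the consistency check that matters for the 394 bp and 587 bp fragments described above.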

The PCR fragment insert in plasmid pFB1 was labeled with 32P-ATP
using a Random Primed DNA Labeling kit (Boehringer Mannheim, Laval,
Quebec), and was used to screen the F. heparinum λDASHII library, Figure
6, constructed as described herein. The lambda library was plated out to
obtain approximately 1500 plaques, which were transferred to
nitrocellulose filters (Schleicher & Schuell, Keene, NH). The PCR probe was
purified by ethanol precipitation. Plaque hybridization was carried out
using the conditions described above. Eight positive lambda plaques were
identified. Lambda DNA was isolated from lysed bacterial cultures as
described in Maniatis and further analyzed by restriction analysis and by
Southern blotting using a Hybond-N nylon membrane (Amersham
Corporation, Arlington Heights, IL) following the protocol described in
Maniatis. A 2.7 kilobase HindIII fragment from lambda plaque #3, which
strongly hybridized to the PCR probe, was isolated and cloned in
pBluescript, in the XL-1 Blue E. coli background, to yield plasmid
pHindIIIBD, Figure 6. This clone was further analyzed by DNA sequencing.
The sequence data were obtained using successive nested deletions of
pHindIIIBD generated with the Erase-a-Base System (Promega Corporation,
Madison, WI) or by sequencing with synthetic oligonucleotide primers.

Sequence analysis revealed a single continuous open reading frame,
without a translational termination codon, of 1929 base pairs
corresponding to 643 amino acids. Further screening of the lambda library
led to the identification of a 673 bp KpnI fragment which was similarly
cloned. A termination codon was found within the KpnI fragment, adding
an extra 51 base pairs to the heparinase III gene and an additional 16
amino acids to the heparinase III protein. The complete heparinase III
gene was later found to be included within a 3.2 kilobase PstI fragment
from lambda plaque #118. The complete heparinase III gene from
Flavobacterium is thus 1980 base pairs in length, Figure 8, and encodes a
659 amino acid protein, Figure 9. N-terminal amino acid sequencing of
deblocked, processed heparinase III indicated that the mature protein
begins with Q-25, and contains 635 amino acids with a calculated
molecular weight of 73,135 Daltons, Figure 9.
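
The lengths quoted in this paragraph are mutually consistent, as the following arithmetic sketch shows. It uses only the figures stated above; the variable names are illustrative, and the assumption that the 51 extra base pairs include the three bases of the termination codon is inferred from the reported 16 additional amino acids.

```python
# Figures taken from the text above.
orf_bp = 1929          # open reading frame on the 2.7 kb HindIII fragment
extra_bp = 51          # bases added by the KpnI fragment (assumed to include the stop codon)
mature_start = 25      # mature protein begins with Q-25

gene_bp = orf_bp + extra_bp                 # complete gene length
orf_aa = orf_bp // 3                        # residues encoded before the KpnI extension
extra_aa = extra_bp // 3 - 1                # extra residues, excluding the stop codon
protein_aa = gene_bp // 3 - 1               # full-length protein, excluding the stop codon
mature_aa = protein_aa - (mature_start - 1) # residues left after removing the first 24

print(gene_bp, orf_aa, extra_aa, protein_aa, mature_aa)
```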






EXAMPLE 9: Expression of Heparinase III in E. coli
PCR was used to generate a mature, truncated heparinase III gene,
which had 16 amino acids deleted from the carboxy-terminus of the
protein. An oligonucleotide comprised of 5'-CGCGGATCCATGCAAAGCT
CTTCCATT-3' (SEQU ID NO:25) was designed to insert an ATG start site
immediately preceding the codon for the first amino acid (Q-25) of mature
heparinase III, while an oligonucleotide comprised of 5'-CGCGGATCCTCA
AAGCTTGCCTTTCTC-3' (SEQU ID NO:26) was designed to insert a
termination codon after the last amino acid of the heparinase III gene on
the 2.7 kb HindIII fragment. Both oligonucleotides also contained a BamHI
site. Plasmid pHindIIIBD was used as the template in a PCR reaction with
an annealing temperature of 50°C. A specific fragment of the expected
size, 1857 base pairs, was obtained. This fragment encodes a protein of
620 amino acids with a calculated MW of 71,535 Da. It was isolated and
inserted in the BamHI site of the expression vector pGB. This construct
was named pGB-H3Δ3', Figure 7.
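
The design of the two primers can be checked with a short script. This is a minimal sketch, not part of the original protocol: it takes the primer sequences as given in SEQU ID NO:25 and SEQU ID NO:26, uses the standard genetic code, and the helper names (reverse_complement, translate) are assumptions of this example. It confirms that each primer carries a BamHI site (GGATCC), that the forward primer places an ATG in front of the mature N-terminus, and that the reverse primer, read on the sense strand, ends the reading frame with a stop codon.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons of `seq`, using '*' for stop codons."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


BAMHI = "GGATCC"
forward = "CGCGGATCCATGCAAAGCTCTTCCATT"   # SEQU ID NO:25
reverse = "CGCGGATCCTCAAAGCTTGCCTTTCTC"   # SEQU ID NO:26

# Coding part of the forward primer: everything after the BamHI site.
fwd_orf = forward[forward.index(BAMHI) + len(BAMHI):]

# The reverse primer anneals to the sense strand, so read its reverse
# complement and keep everything before the BamHI site.
rev_sense = reverse_complement(reverse)
rev_orf = rev_sense[:rev_sense.index(BAMHI)]

fwd_peptide = translate(fwd_orf)
rev_peptide = translate(rev_orf)
print(fwd_peptide, rev_peptide)
```

The forward primer translates to Met followed by Gln-Ser-Ser-Ser-Ile, and the reverse primer encodes Glu-Lys-Gly-Lys-Leu followed by a stop, which matches residues found in SEQ ID NO:4.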
To add the missing 3' region of heparinase III, the BspEI/SalI
restriction fragment from pGB-H3Δ3' was removed and replaced with the
BspEI/SalI fragment from pFB5. The construct containing the complete
heparinase III gene was named pGBH3, Figure 7. Recombinant heparinase
III is a protein of 637 amino acids with a calculated molecular weight of
73,266 Daltons. E. coli strain XL-1 Blue(pGBH3) was grown at 37°C in LB
medium containing 75 µg/ml kanamycin to an OD600 of 0.5, at which point
the tac promoter from pGB was induced by the addition of 1 mM IPTG.
Cultures were grown an additional 2-5 hours at either 23°C, 30°C or 37°C.
The cells were cooled on ice, concentrated by centrifugation and
resuspended in cold PBS at 1/10th the original culture volume. Cells were
lysed by sonication and cell debris was removed by centrifugation at
10,000 x g for 5 minutes. The pellet and supernatant fractions were
analyzed for heparan sulfate degrading (heparinase III) activity. Heparan
sulfate degrading activities of 1.29, 5.27 and 3.29 IU/ml were observed
from cultures grown at 23°C, 30°C and 37°C, respectively.

The present invention describes a methodology for obtaining highly 
purified heparin and heparan sulfate degrading proteins by expressing the 
genes for these proteins in a suitable expression system and applying the 
steps of cell disruption, cation exchange chromatography, affinity 
chromatography and hydroxylapatite chromatography. Variations of these 
methods will be obvious to those skilled in the art from the foregoing 
detailed description of the invention. Such modifications are intended to 
come within the scope of the appended claims. 

TABLE 1

Purification of heparinase enzymes from Flavobacterium heparinum
fermentations

sample                          activity    specific activity    yield
                                (IU)        (IU/mg)              (%)

fermentation
  heparin degrading             39,700      1.06                 100
  heparan sulfate degrading     75,400      ND                   100

osmolate
  heparin degrading             15,749      ND                   40
  heparan sulfate degrading     42,000      ND                   56

cation exchange
  heparin degrading             12,757      ND                   32
  heparan sulfate degrading     27,540      ND                   37

cellufine sulfate
  heparin degrading             8,190       ND                   21
  heparan sulfate degrading     9,328       30.8                 12

hydroxylapatite
  heparinase I                  7,150       115.3                18
  heparinase II                 2,049       28.41                3
  heparinase III                5,150       44.46                7
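
The yield column of Table 1 appears to be the activity recovered at each step expressed as a percentage of the activity in the fermentation broth. The sketch below recomputes it for the pooled heparin degrading and heparan sulfate degrading activities; the activity values are those listed in Table 1, while the function name and data layout are assumptions of this example.

```python
# Starting activities (IU) in the fermentation broth, from Table 1.
START = {"heparin": 39_700, "heparan sulfate": 75_400}

# Activity (IU) recovered after each purification step, from Table 1.
STEPS = {
    "osmolate":          {"heparin": 15_749, "heparan sulfate": 42_000},
    "cation exchange":   {"heparin": 12_757, "heparan sulfate": 27_540},
    "cellufine sulfate": {"heparin": 8_190,  "heparan sulfate": 9_328},
}


def percent_yield(recovered, starting):
    """Recovered activity as a whole-number percentage of the starting activity."""
    return round(100 * recovered / starting)


yields = {
    step: {kind: percent_yield(iu, START[kind]) for kind, iu in acts.items()}
    for step, acts in STEPS.items()
}

for step, row in yields.items():
    print(f"{step:18s} {row['heparin']:3d} % {row['heparan sulfate']:3d} %")
```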

TABLE 2

Properties of heparinases from Flavobacterium heparinum

sample                   heparinase I    heparinase II    heparinase III

Km (µM)                  17.8            57.7             29.4
Kcat (s-1)               157             23.3             164
substrate specificity    H               H and HS         HS
N-terminal peptide       QQKKSG          QTKADV           QSSSIT
glycosylation            yes             yes

H - heparin, HS - heparan sulfate

TABLE 3

Codon usage table for Flavobacterium and Escherichia coli

amino                                          consensus codon
acid     codon(s)                              E. coli        Flavobacterium

A        GCT, GCC, GCG, GCA                    GCT
C        TGT, TGC                              EITHER         EITHER
D        GAT, GAC                              EITHER         EITHER
E        GAG, GAA                              GAA            GAA
F        TTC, TTT                              EITHER         TTT
G        GGC, GGA, GGG, GGT                    GGC or GGT     GGC
H        CAC, CAT                              CAT            CAT
I        ATC, ATA, ATT                         ATA            ATC
K        AAA, AAG                              AAA            AAA
L        CTT, CTA, CTG, TTG, TTA, CTC          CTG            CTG
M        ATG                                   ATG            ATG
N        AAC, AAT                              AAC
P        CCC, CCT, CCA, CCG                    CCG            CCG
Q        CAG, CAA                              CAG            CAG
R        CGT, AGA, CGC, CGA, AGG, CGG          CGT            CGC
S        TCA, TCC, TCG, TCT, AGC, AGT          TCT            ?
T        ACG, ACC, ACT, ACA                    ACC or ACT     ACC or ACA
V        GTC, GTA, GTT, GTG                    GTT            ?
W        TGG                                   TGG            TGG
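
A consensus codon table such as Table 3 is typically derived by tallying how often each codon occurs in known coding sequences. The following sketch shows one way to produce such a tally; it is an illustration only, not the method used to build Table 3. The input is the first 60 bases of SEQ ID NO:1 (the start of the heparinase II reading frame), and the function name count_codons is an assumption of this example.

```python
from collections import Counter


def count_codons(cds):
    """Count in-frame codons of a coding sequence, ignoring any trailing partial codon."""
    return Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))


# First 60 bases of SEQ ID NO:1, starting at the ATG start codon.
heparinase_ii_start = (
    "ATGAAAAGACAATTATACCTGTATGTGATT"
    "TTTGTTGTAGTTGAACTTATGGTTTTTACA"
)

codon_counts = count_codons(heparinase_ii_start)

# Report codons from most to least frequent.
for codon, n in codon_counts.most_common():
    print(codon, n)
```

A meaningful consensus would require counting over complete genes, preferably several of them; twenty codons are far too few to draw conclusions about codon preference.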

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT (S) : IBEX TECHNOLOGIES and 
ZIMMERMANN, Joseph 



(ii) TITLE OF INVENTION: Nucleic Acid Sequences And Expression 

Systems For Heparinase II And Heparinase III Derived From 
Flavobacterium heparinum 

(iii) NUMBER OF SEQUENCES: 26

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Hale and Dorr 

(B) STREET: 1455 Pennsylvania Avenue, N.W. 

(C) CITY: Washington, D.C. 

(E) COUNTRY: U.S.A. 

(F) ZIP: 20004 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT/US95/07391 

(B) FILING DATE: 09-JUNE-1995 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/258,639
(B) FILING DATE: 10-JUNE-1994

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: BAKER, Hollie L. 

(B) REGISTRATION NUMBER: 31,321 

(C) REFERENCE/DOCKET NUMBER: 104385. 116PCT 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (202)942-8400

(B) TELEFAX: (202)942-8484 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2339 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATGAAAAGAC AATTATACCT GTATGTGATT TTTGTTGTAG TTGAACTTAT GGTTTTTACA 60
ACAAAGGGCT ATTCCCAAAC CAAGGCCGAT GTGGTTTGGA AAGACGTGGA TGGCGTATCT 120
ATGCCCATAC CCCCTAAGAC CCACCCGCGT TTGTATCTAC GTGAGCAGCA AGTTCCTGAC 180
CTGAAAAACA GGATGAACGA CCCTAAACTG AAAAAAGTTT GGGCCGATAT GATCAAGATG 240
CAGGAAGACT GGAAGCCAGC TGATATTCCT GAAGTTAAAG ACTTTCGTTT TTATTTTAAC 300
CAGAAAGGGC TTACTGTAAG GGTTGAACTA ATGGCCCTGA ACTATCTGAT GACCAAGGAT 360
CCAAAGGTAG GACGGGAAGC CATCACTTCA ATTATTGATA CCCTTGAAAC TGCAACTTTT 420
AAACCAGCAG GTGATATTTC GAGAGGGATA GTGATATTTC GAGAGGGATA GGCCTGTTTA 480
TGGTTACAGG GGCCATTGTG TATGACTGGT GCTACGATCA GCTGAAACCA GAAGAGAAAA 540
CACGTTTTGT GAAGGCATTT GTGAGGCTGG CCAAAATGCT CGAATGTGGT TATCCTCCGG 600
TAAAAGACAA GTCTATTGTT GGGCATGCTT CCGAATGGAT GATCATGCGG GACCTGCTTT 660
CTGTAGGGAT TGCCATTTAC GATGAATTCC CTGAGATGTA TAACCTGGCT GCGGGTCGTT 720
TTTTGAAAGA ACACCTGGTT GCCCGCAACT GGTTTTATCC CTCGCATAAC TACCATCAGG 780
GTATGTCATA CCTGAACGTA AGATTTACCA ACGACCTTTT TGCCCTCTGG ATATTAGACC 840
GGATGGGCGC TGGTAATGTG TTTAATCCAG GGCAGCAGTT TATCCTTTAT GACGCGATCT 900
ATAAACGCCG CCCCGATGGA CAGATTTTAG CAGGTGGAGA TGTAGATTAT TCCAGGAAAA 960
AACCAAAATA TTATACGATG CCTGCATTGC TTGCAGGTAG CTATTATAAA GATGAATACC 1020
TTAATTACGA ATTCCTGAAA GATCCCAATG TTGAGCCACA TTGCAAATTG TTCGAATTTT 1080
TATGGCGCGA TACCCAGTTG GGAAGTCGTA AGCCTGATGA TTTGCCACTT TCCAGGTACT 1140
CAGGATCGCC TTTTGGATGG ATGATTGCCC GTACCGGATG GGGTCCGGAA AGTGTGATTG 1200
CAGAGATGAA AGTCAACGAA TATTCCTTTC TTAACCATCA GCATCAGGAT GCAGGAGCCT 1260
TCCAGATCTA TTACAAAGGC CCGCTGGCCA TAGATGCAGG CTCGTATACA GGTTCTTCAG 1320
GAGGTTATAA CAGTCCGCAC AACAAGAACT TTTTTAAGCG GACTATTGCA CACAATAGCT 1380
TGCTGATTTA CGATCCTAAA GAAACTTTCA GTTCGTCGGG ATATGGTGGA AGTGACCATA 1440
CCGATTTTGC TGCCAACGAT GGTGGTCAGC GGCTGCCCGG AAAAGGTTGG ATTGCACCCC 1500
GCGACCTTAA AGAAATGCTG GCAGGCGATT TCAGGACCGG CAAAATTCTT GCCCAGGGCT 1560
TTGGTCCGGA TAACCAAACC CCTGATTATA CTTATCTGAA AGGAGACATT ACAGCAGCTT 1620
ATTCGGCAAA AGTGAAGGAA GTAAAACGTT CATTTCTATT CCTGAACCTT AAGGATGCCA 1680
AAGTTCCGGC AGCGATGATC GTTTTTGACA AGGTAGTTGC TTCCAATCCT GATTTTAAGA 1740
AGTTCTGGTT GTTGCACAGT ATTGAGCAGC CTGAAATAAA GGGGAATCAG ATTACCATAA 1800
AACGTACAAA AAACGGTGAT AGTGGGATGT TGGTGAATAC GGCTTTGCTG CCGGATGCGG 1860
CCAATTCAAA CATTACCTCC ATTGGCGGCA AGGGCAAAGA CTTCTGGGTG TTTGGTACCA 1920
ATTATACCAA TGATCCTAAA CCGGGCACGG ATGAAGCATT GGAACGTGGA GAATGGCGTG 1980
TGGAAATCAC TCCAAAAAAG GCAGCAGCCG AAGATTACTA CCTGAATGTG ATACAGATTG 2040
CCGACAATAC ACAGCAAAAA TTACACGAGG TGAAGCGTAT TGACGGTGAC AAGGTTGTTG 2100
GTGTGCAGCT TGCTGACAGG ATAGTTACTT TTAGCAAAAC TTCAGAAACT GTTGATCGTC 2160
CCTTTGGCTT TTCCGTTGTT GGTAAAGGAA CATTCAAATT TGTGATGACC GATCTTTTAG 2220
CGGGTACCTG GCAGGTGCTG AAAGACGGAA AAATACTTTA TCCTGCGCTT TCTGCAAAAG 2280
GTGATGATGG ACCCCTTTAT TTTGAAGGAA CTGAAGGAAC CTACCGTTTT TTGAGATAA 2339



(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 772 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Lys Arg Gln Leu Tyr Leu Tyr Val Ile Phe Val Val Val Glu Leu
1 5 10 15

Met Val Phe Thr Thr Lys Gly Tyr Ser Gln Thr Lys Ala Asp Val Val
20 25 30

Trp Lys Asp Val Asp Gly Val Ser Met Pro Ile Pro Pro Lys Thr His
35 40 45

Pro Arg Leu Tyr Leu Arg Glu Gln Gln Val Pro Asp Leu Lys Asn Arg
50 55 60

Met Asn Asp Pro Lys Leu Lys Lys Val Trp Ala Asp Met Ile Lys Met
65 70 75 80

Gln Glu Asp Trp Lys Pro Ala Asp Ile Pro Glu Val Lys Asp Phe Arg
85 90 95

Phe Tyr Phe Asn Gln Lys Gly Leu Thr Val Arg Val Glu Leu Met Ala
100 105 110

Leu Asn Tyr Leu Met Thr Lys Asp Pro Lys Val Gly Arg Glu Ala Ile
115 120 125

Thr Ser Ile Ile Asp Thr Leu Glu Thr Ala Thr Phe Lys Pro Ala Gly
130 135 140

Asp Ile Ser Arg Gly Ile Gly Leu Phe Met Val Thr Gly Ala Ile Val
145 150 155 160

Tyr Asp Trp Cys Tyr Asp Gln Leu Lys Pro Glu Glu Lys Thr Arg Phe
165 170 175

Val Lys Ala Phe Val Arg Leu Ala Lys Met Leu Glu Cys Gly Tyr Pro
180 185 190

Pro Val Lys Asp Lys Ser Ile Val Gly His Ala Ser Glu Trp Met Ile
195 200 205

Met Arg Asp Leu Leu Ser Val Gly Ile Ala Ile Tyr Asp Glu Phe Pro
210 215 220

Glu Met Tyr Asn Leu Ala Ala Gly Arg Phe Phe Lys Glu His Leu Val
225 230 235 240

Ala Arg Asn Trp Phe Tyr Pro Ser His Asn Tyr His Gln Gly Met Ser
245 250 255

Tyr Leu Asn Val Arg Phe Thr Asn Asp Leu Phe Ala Leu Trp Ile Leu
260 265 270

Asp Arg Met Gly Ala Gly Asn Val Phe Asn Pro Gly Gln Gln Phe Ile
275 280 285

Leu Tyr Asp Ala Ile Tyr Lys Arg Arg Pro Asp Gly Gln Ile Leu Ala
290 295 300

Gly Gly Asp Val Asp Tyr Ser Arg Lys Lys Pro Lys Tyr Tyr Thr Met
305 310 315 320

Pro Ala Leu Leu Ala Gly Ser Tyr Tyr Lys Asp Glu Tyr Leu Asn Tyr
325 330 335

Glu Phe Leu Lys Asp Pro Asn Val Glu Pro His Cys Lys Leu Phe Glu
340 345 350

Phe Leu Trp Arg Asp Thr Gln Leu Gly Ser Arg Lys Pro Asp Asp Leu
355 360 365

Pro Leu Ser Arg Tyr Ser Gly Ser Pro Phe Gly Trp Met Ile Ala Arg
370 375 380

Thr Gly Trp Gly Pro Glu Ser Val Ile Ala Glu Met Lys Val Asn Glu
385 390 395 400

Tyr Ser Phe Leu Asn His Gln His Gln Asp Ala Gly Ala Phe Gln Ile
405 410 415

Tyr Tyr Lys Gly Pro Leu Ala Ile Asp Ala Gly Ser Tyr Thr Gly Ser
420 425 430

Ser Gly Gly Tyr Asn Ser Pro His Asn Lys Asn Phe Phe Lys Arg Thr
435 440 445

Ile Ala His Asn Ser Leu Leu Ile Tyr Asp Pro Lys Glu Thr Phe Ser
450 455 460

Ser Ser Gly Tyr Gly Gly Ser Asp His Thr Asp Phe Ala Ala Asn Asp
465 470 475 480

Gly Gly Gln Arg Leu Pro Gly Lys Gly Trp Ile Ala Pro Arg Asp Leu
485 490 495

Lys Glu Met Leu Ala Gly Asp Phe Arg Thr Gly Lys Ile Leu Ala Gln
500 505 510

Gly Phe Gly Pro Asp Asn Gln Thr Pro Asp Tyr Thr Tyr Leu Lys Gly
515 520 525

Asp Ile Thr Ala Ala Tyr Ser Ala Lys Val Lys Glu Val Lys Arg Ser
530 535 540

Phe Leu Phe Leu Asn Leu Lys Asp Ala Lys Val Pro Ala Ala Met Ile
545 550 555 560

Val Phe Asp Lys Val Val Ala Ser Asn Pro Asp Phe Lys Lys Phe Trp
565 570 575

Leu Leu His Ser Ile Glu Gln Pro Glu Ile Lys Gly Asn Gln Ile Thr
580 585 590

Ile Lys Arg Thr Lys Asn Gly Asp Ser Gly Met Leu Val Asn Thr Ala
595 600 605

Leu Leu Pro Asp Ala Ala Asn Ser Asn Ile Thr Ser Ile Gly Gly Lys
610 615 620

Gly Lys Asp Phe Trp Val Phe Gly Thr Asn Tyr Thr Asn Asp Pro Lys
625 630 635 640

Pro Gly Thr Asp Glu Ala Leu Glu Arg Gly Glu Trp Arg Val Glu Ile
645 650 655

Thr Pro Lys Lys Ala Ala Ala Glu Asp Tyr Tyr Leu Asn Val Ile Gln
660 665 670

Ile Ala Asp Asn Thr Gln Gln Lys Leu His Glu Val Lys Arg Ile Asp
675 680 685

Gly Asp Lys Val Val Gly Val Gln Leu Ala Asp Arg Ile Val Thr Phe
690 695 700

Ser Lys Thr Ser Glu Thr Val Asp Arg Pro Phe Gly Phe Ser Val Val
705 710 715 720

Gly Lys Gly Thr Phe Lys Phe Val Met Thr Asp Leu Leu Ala Gly Ile
725 730 735

Trp Gln Val Leu Lys Asp Gly Lys Ile Leu Tyr Pro Ala Leu Ser Ala
740 745 750

Lys Gly Asp Asp Gly Pro Leu Tyr Phe Glu Gly Thr Glu Gly Thr Tyr
755 760 765

Arg Phe Leu Arg
770



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1980 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATGACTACGA AAATTTTTAA AAGGATCATT GTATTTGCTG TAATTGCCCT 50
ATCGTCGGGA AATATACTTG CACAAAGCTC TTCCATTACC AGGAAAGATT 100
TTGACCANAT NAANCTTGAN TANTCCGGAC TGGAAAAGGT TAATAAAGCA 150
GTTGCTGCCG GCAACTATGA NGATGCGGCC AAAGCATTAC TGGCATACTA 200
CAGGGAAAAA AGTAAGGCNN GNGAACCTGA TTTCAGTAAT GCAGAAAAGC 250
CTGCCGATAT ACGCCANCCN ATAGATAAGG TTACGCGTGA AATGGCCGAC 300
AAGGCTTTGG TCCACCANTT TCAACCGCAC AAAGGCTACG GCTATTTTGA 350
TTATGGTAAA GACATCAACT GGCAGATGTG GCCGGTAAAA GACAATGAAG 400
NNNNNNNNNN NTTGCACCGT GTAAAATGGT GGCAGGCTAT GGCCCTGGTT 450
NNNNNNNNNN CNGGCGATGA AAAATATGCA AGAGAATGGG TATATCAGTA 500
CAGCGATTGG GCCAGAAAAA ACCCATTGGG CCTGTCGCAG GATAATGATA 550
AATTTGTGTG GCGGCCCCTT GAAGTGTCGG ACAGGGTACA AAGTCTTCCC 600
CCAACCTTCA GCTTATTTGT AAACTCGCCA GCCTTTACCC CAGCCTTTTT 650
AATGGAATTT TTAAACAGTT ACCACCAACA GGCCGATTAT TTATCTACGC 700
ATTATGCCGA ACAGGGAAAC CACCGTTTAT TTGAAGCCCA ACGCAACTTG 750
TTTGCAGGGG TATCTTTCCC TGAATTTAAA GATTCACCAA GATGGAGGCA 800
AACCGGCATA TCGGTGCTGA ACACCGAGAT CAAAAAACAG GTTTATGCCG 850
ATGGGATGCA GTTTGAACTT TCACCAATTT ACCATGTAGC TGCCATCGAT 900
NNNNNNNNNA AGGCCTATGG TTCTGCAAAA CGAGTTAACC TTGAAAAAGA 950
ATTTCCGCAA TCTTATGTAC AAACTGTAGA AAATATGATT ATGGCGCTGA 1000
TCAGTATTTC ACTGCCAGAT TATAACACCC CTATGTTTGG AGATTCATGG 1050
ATTACAGATA AAAATTTCAG GATGGCACAG TTTGCCAGCT GGGCCCGGGT 1100
TTTCCCGGCA AACCAGGCCA TAAAATATTT TGCTACAGAT GGCAAACAAG 1150
GTAAGGCGCC TAACTTTTTA TCCAAAGCAT TGAGCAATGC AGGCTTTTAT 1200
ACGTTTAGAA GCGGATGGGA TAAAAATGCA ACCGTTATGG TATTAAAAGC 1250
CAGTCCTCCC GGGGAATTTC ATGCCCAGCC GGATAACGGG ACTTTTGAAC 1300
TTTTTATAAA GGGCAGAAAC TTTACCCCAG ACGCCGGGGT ATTTGTGTAT 1350
AGCGGCGACG AAGCCATCAT GAAACTGCGG AACTGGTACC GTCAAACCCG 1400
CATACACAGC ACGCTTACAC TCGACAATCA AAATATGGTC ATTACCAAAG 1450
CCCGGCAAAA CAAATGGGAA ACAGGAAATA ACCTTGATGT GCTTACCTAT 1500
ACCAACCCAA GCTATCCGAA TCTGGACCAT CAGCGCAGTG TACTTTTCAT 1550
CAACAAAAAA TACTTTCTGG TCATCGATAG GGCAATAGGC GAAGCTACCG 1600
GAAACCTGGG CGTACACTGG CAGCTTAAAG AAGACAGCAA CCCTGTTTTC 1650
GATAAGACAA AGAACCGGGT TTACACCACT TACAGAGATG GTAACAACCT 1700
GATGATCCAA TCGTTGAATG CGGACAGGAC CAGCCTCAAT GAAGAAGAAG 1750
GAAAGGTATC TTATGTTTAC AATAAGGAGC TGAAAAGACC TGCTTTCGTA 1800
TTTGAAAAGC CTAAAAAGAA TGCCGGCACA CAAAATTTTG TCAGTATAGT 1850
TTATCCATAC GACGGCCAGA AGGCTCCAGA GATCAGCATA CGGGAAAACA 1900
AGGGCAATGA TTTTGAGAAA GGCAAGCTTA ATCTAACCCT TACCATTAAC 1950
GGAAAACAAC AGCTTGTGTT GGTTCCTTAG 1980



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 659 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Thr Thr Lys Ile Phe Lys Arg Ile Ile Val Phe Ala Val Ile Ala
1 5 10 15

Leu Ser Ser Gly Asn Ile Leu Ala Gln Ser Ser Ser Ile Thr Arg Lys
20 25 30

Asp Phe Asp His Ile Asn Leu Glu Tyr Ser Gly Leu Glu Lys Val Asn
35 40 45

Lys Ala Val Ala Ala Gly Asn Tyr Asp Asp Ala Ala Lys Ala Leu Leu
50 55 60

Ala Tyr Tyr Arg Glu Lys Ser Lys Ala Arg Glu Pro Asp Phe Ser Asn
65 70 75 80

Ala Glu Lys Pro Ala Asp Ile Arg Gln Pro Ile Asp Lys Val Thr Arg
85 90 95

Glu Met Ala Asp Lys Ala Leu Val His Gln Phe Gln Pro His Lys Gly
100 105 110

Tyr Gly Tyr Phe Asp Tyr Gly Lys Asp Ile Asn Trp Gln Met Trp Pro
115 120 125

Val Lys Asp Asn Glu Val Arg Trp Gln Leu His Arg Val Lys Trp Trp
130 135 140

Gln Ala Met Ala Leu Val Tyr His Ala Thr Gly Asp Glu Lys Tyr Ala
145 150 155 160

Arg Glu Trp Val Tyr Gln Tyr Ser Asp Trp Ala Arg Lys Asn Pro Leu
165 170 175

Gly Leu Ser Gln Asp Asn Asp Lys Phe Val Trp Arg Pro Leu Glu Val
180 185 190

Ser Asp Arg Val Gln Ser Leu Pro Pro Thr Phe Ser Leu Phe Val Asn
195 200 205

Ser Pro Ala Phe Thr Pro Ala Phe Leu Met Glu Phe Leu Asn Ser Tyr
210 215 220

His Gln Gln Ala Asp Tyr Leu Ser Thr His Tyr Ala Glu Gln Gly Asn
225 230 235 240

His Arg Leu Phe Glu Ala Gln Arg Asn Leu Phe Ala Gly Val Ser Phe
245 250 255

Pro Glu Phe Lys Asp Ser Pro Arg Trp Arg Gln Thr Gly Ile Ser Val
260 265 270

Leu Asn Thr Glu Ile Lys Lys Gln Val Tyr Ala Asp Gly Met Gln Phe
275 280 285

Glu Leu Ser Pro Ile Tyr His Val Ala Ala Ile Asp Ile Phe Leu Lys
290 295 300

Ala Tyr Gly Ser Ala Lys Arg Val Asn Leu Glu Lys Glu Phe Pro Gln
305 310 315 320

Ser Tyr Val Gln Thr Val Glu Asn Met Ile Met Ala Leu Ile Ser Ile
325 330 335

Ser Leu Pro Asp Tyr Asn Thr Pro Met Phe Gly Asp Ser Trp Ile Thr
340 345 350

Asp Lys Asn Phe Arg Met Ala Gln Phe Ala Ser Trp Ala Arg Val Phe
355 360 365

Pro Ala Asn Gln Ala Ile Lys Tyr Phe Ala Thr Asp Gly Lys Gln Gly
370 375 380

Lys Ala Pro Asn Phe Leu Ser Lys Ala Leu Ser Asn Ala Gly Phe Tyr
385 390 395 400

Thr Phe Arg Ser Gly Trp Asp Lys Asn Ala Thr Val Met Val Leu Lys
405 410 415

Ala Ser Pro Pro Gly Glu Phe His Ala Gln Pro Asp Asn Gly Thr Phe
420 425 430

Glu Leu Phe Ile Lys Gly Arg Asn Phe Thr Pro Asp Ala Gly Val Phe
435 440 445

Val Tyr Ser Gly Asp Glu Ala Ile Met Lys Leu Arg Asn Trp Tyr Arg
450 455 460

Gln Thr Arg Ile His Ser Thr Leu Thr Leu Asp Asn Gln Asn Met Val
465 470 475 480

Ile Thr Lys Ala Arg Gln Asn Lys Trp Glu Thr Gly Asn Asn Leu Asp
485 490 495

Val Leu Thr Tyr Thr Asn Pro Ser Tyr Pro Asn Leu Asp His Gln Arg
500 505 510

Ser Val Leu Phe Ile Asn Lys Lys Tyr Phe Leu Val Ile Asp Arg Ala
515 520 525

Ile Gly Glu Ala Thr Gly Asn Leu Gly Val His Trp Gln Leu Lys Glu
530 535 540

Asp Ser Asn Pro Val Phe Asp Lys Thr Lys Asn Arg Val Tyr Thr Thr
545 550 555 560

Tyr Arg Asp Gly Asn Asn Leu Met Ile Gln Ser Leu Asn Ala Asp Arg
565 570 575

Thr Ser Leu Asn Glu Glu Glu Gly Lys Val Ser Tyr Val Tyr Asn Lys
580 585 590

Glu Leu Lys Arg Pro Ala Phe Val Phe Glu Lys Pro Lys Lys Asn Ala
595 600 605

Gly Thr Gln Asn Phe Val Ser Ile Val Tyr Pro Tyr Asp Gly Gln Lys
610 615 620

Ala Pro Glu Ile Ser Ile Arg Glu Asn Lys Gly Asn Asp Phe Glu Lys
625 630 635 640

Gly Lys Leu Asn Leu Thr Leu Thr Ile Asn Gly Lys Gln Gln Leu Val
645 650 655

Leu Val Pro

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 5: 

Glu Phe Pro Glu Met Tyr Asn Leu Ala Ala Gly Arg 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids






WO 95/34635 



PCT/US95/07391 



(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Lys Pro Ala Asp Ile Pro Glu Val Lys Asp Gly Arg
1 5 10



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 




Pro Asp Asn Gln Thr Pro Asp Tyr Thr Tyr Leu
20 25 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Leu Ile Lys Asn Glu Val Arg Trp Gln Leu His Arg Val Lys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Val Leu Lys Ala Ser Pro Pro Gly Glu Phe His Ala Gln Pro Asp Asn
1 5 10 15

Gly Thr Phe Glu Leu Phe Ile
20

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Lys Ala Leu Val His Trp Phe Trp Pro His Lys Gly Tyr Gly Tyr Phe 
1 5 10 15 

Asp Tyr Gly Lys Asp Ile Asn
20 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GAATTCCCTG AGATGTACAA TCTGGCCGC 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CCGGCAGCCA GATTGTACAT TTCAGG 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
AAACCCGCCG ACATTCCCGA AGTAAAAGA 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
CGAAAGTCTT TTACTTCGGG AATGTCGGC 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
TGAGGATTCA TGCAAACCAA GGCCGATGTG GTTTGGAA 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
GGAGGATAAC CACATTCGAG CATT 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
GAATTCCATC AGTTTCAGCC GCATAAA 
(2) INFORMATION FOR SEQ ID NO: 18: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GAATTCTTTA TGCGGCTGAA ACTGATG 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
GAATTCCCGC CGGGCGAATT TCATGC 
(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

GAATTCGCAT GAAATTCGCC CGGCGG 

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
GGGAATTTCC ATGCCCAGCC GAAATGGAC 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
GTCCATTTCG GCTGGGCATG AAATTCCC 
(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 
GTCATCAGTT CAGCCCATAA AGGTATGG 
(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
CCCATACCTT ATGGGCTGAA CTGATGAC 
(2) INFORMATION FOR SEQ ID NO:25: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
CGCGGATCCA TGCAAAGCTC TTCCATT
(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 26: 
CGCGGATCCT CAAAGCTTGC CTTTCTC 

We claim:



1. A recombinant nucleic acid sequence which encodes heparinase II 
from Flavobacterium heparinum. 

2. The nucleic acid sequence of claim 1 comprising the sequence of
SEQU ID NO:1.



3. The nucleic acid sequence of claim 1 further comprising a nucleic 
acid sequence capable of directing the expression of said heparinase. 

4. The nucleic acid sequence of claim 3 comprising a modified ribosome 
binding region. 

5. A host cell transformed with a vector comprising the nucleic acid
sequence of claim 3, said host cell being capable of expressing heparinase
II.

6. The host cell of claim 5, wherein said host cell is E. coli.



7. A recombinant nucleic acid sequence which encodes heparinase III 
from Flavobacterium heparinum. 



8. The nucleic acid sequence of claim 7 comprising the sequence of
SEQU ID NO:3.



9. The nucleic acid sequence of claim 7 further comprising a nucleic
acid sequence capable of directing the expression of said heparinase.

10. The nucleic acid sequence of claim 9 comprising a modified ribosome
binding region.



11. A host cell transformed with a vector comprising the nucleic acid 
sequence of claim 9, said host cell being capable of expressing heparinase 
III. 



12. The host cell of claim 11, wherein said host cell is E. coli. 

13. Isolated, recombinant heparinase II in substantially pure form. 

14. The heparinase II of claim 13 comprising the amino acid sequence of 
SEQ ID NO:2.

15. Isolated, recombinant heparinase III in substantially pure form. 

16. The heparinase III of claim 15 comprising the amino acid sequence 
of SEQ ID NO:4.

17. An expression vector for the expression of heparinases comprising a 
modified ribosome binding region containing a Shine-Dalgarno sequence, a 
spacer region between the Shine-Dalgarno sequence and the ATG start 
codon, and a recombinant nucleotide sequence encoding heparinase I, II or 
III. 



18. The expression vector of claim 17 wherein the Shine-Dalgarno 
sequence is 5 base pairs in length. 

19. The expression vector of claim 17 wherein the spacer region between 
the Shine-Dalgarno sequence and the ATG start codon is 9 base pairs in 
length. 

20. A method of expressing genes from Flavobacterium species 
comprising constructing the expression vector of claim 17 and 
transforming a prokaryote host cell with said expression vector. 

21. The method of claim 20 wherein said expression vector encodes 
heparinase I. 



22. The method of claim 20 wherein said expression vector encodes 
heparinase II. 



23. The method of claim 20 wherein said expression vector encodes 
heparinase III. 

24. An antibody isolated from animals injected with a heparinase from F. 
heparinum which is specific for the amino acid sequences of the 
heparinase. 

25. The antibody of claim 24 wherein said heparinase is heparinase I. 

26. The antibody of claim 24 wherein said heparinase is heparinase II. 



27. The antibody of claim 24 wherein said heparinase is heparinase III. 

28. An antibody isolated from animals injected with a heparinase which 
is specific for non-amino acid moieties of post-translationally modified F. 
heparinum proteins. 

29. The polyclonal antibody of claim 28 wherein said heparinase is 
heparinase I. 

30. The polyclonal antibody of claim 28 wherein said heparinase is 
heparinase II. 

31. The polyclonal antibody of claim 28 wherein said heparinase is 
heparinase III. 

32. A method of purifying heparinases from Flavobacterium heparinum 
comprising the steps of culturing F. heparinum cells, disrupting the cells, 
and performing cation exchange chromatography, affinity chromatography 
and hydroxylapatite chromatography. 
pBhep    AGGAAACAGAATTCATG
         S-D   10 nt

pGhep    AGGAGACAGAATTCATG
         S-D   9 nt

pΔ4hep   AGGAGAATTCATG
         S-D   5 nt

pGB      AGGAGACAGGATCC
         S-D

FIG. 1
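
The spacer lengths labelled in FIG. 1 (10 nt, 9 nt and 5 nt) can be reproduced by counting the bases between the Shine-Dalgarno (S-D) sequence and the ATG start codon. The sketch below is illustrative only. The plasmid names and sequences are as read from a degraded scan of the figure, and the S-D boundaries (AGGA for the first construct, AGGAG for the others) are assumptions chosen because they agree with the labelled spacer lengths; `RBS_VARIANTS` and `spacer_length` are hypothetical names.

```python
# Count the spacer between a Shine-Dalgarno motif and the ATG start codon.
# Sequences and S-D boundaries are assumptions read from FIG. 1.
RBS_VARIANTS = {
    "pBhep": ("AGGAAACAGAATTCATG", "AGGA"),
    "pGhep": ("AGGAGACAGAATTCATG", "AGGAG"),
    "pΔ4hep": ("AGGAGAATTCATG", "AGGAG"),
}


def spacer_length(seq, sd, start_codon="ATG"):
    """Return the number of bases between the end of `sd` and the next start codon."""
    sd_pos = seq.find(sd)
    if sd_pos < 0:
        raise ValueError("Shine-Dalgarno motif not found")
    sd_end = sd_pos + len(sd)
    atg_pos = seq.find(start_codon, sd_end)
    if atg_pos < 0:
        raise ValueError("start codon not found downstream of the motif")
    return atg_pos - sd_end


if __name__ == "__main__":
    for name, (seq, sd) in RBS_VARIANTS.items():
        print(name, len(sd), "bp S-D,", spacer_length(seq, sd), "nt spacer")
```

Under these assumptions the second construct has a 5 base pair S-D sequence and a 9 nt spacer, the combination recited in claims 18 and 19.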
ATGAAAAGAC AATTATACCT GTATGTGATT TTTGTTGTAG TTGAACTTAT GGTTTTTACA 60 

ACAAAGGGCT ATTCCCAAAC CAAGGCCGAT GTGGTTTGGA AAGACGTGGA TGGCGTATCT 120 

ATGCCCATAC CCCCTAAGAC CCACCCGCGT TTGTATCTAC GTGAGCAGCA AGTTCCTGAC 180 

CTGAAAAACA GGATGAACGA CCCTAAACTG AAAAAAGTTT GGGCCGATAT GATCAAGATG 240 

CAGGAAGACT GGAAGCCAGC TGATATTCCT GAAGTTAAAG ACTTTCGTTT TTATTTTAAC 300 

CAGAAAGGGC TTACTGTAAG GGTTGAACTA ATGGCCCTGA ACTATCTGAT GACCAAGGAT 360 

CCAAAGGTAG GACGGGAAGC CATCACTTCA ATTATTGATA CCCTTGAAAC TGCAACTTTT 420 

AAACCAGCAG GTGATATTTC GAGAGGGATA GGCCTGTTTA TGGTTACAGG GGCCATTGTG 480 

TATGACTGGT GCTACGATCA GCTGAAACCA GAAGAGAAAA CACGTTTTGT GAAGGCATTT 540 

GTGAGGCTGG CCAAAATGCT CGAATGTGGT TATCCTCCGG TAAAAGACAA GTCTATTGTT 600 

GGGCATGCTT CCGAATGGAT GATCATGCGG GACCTGCTTT CTGTAGGGAT TGCCATTTAC 660 

GATGAATTCC CTGAGATGTA TAACCTGGCT GCGGGTCGTT TTTTCAAAGA ACACCTGGTT 720 

GCCCGCAACT GGTTTTATCC CTCGCATAAC TACCATCAGG GTATGTCATA CCTGAACGTA 780 

AGATTTACCA ACGACCTTTT TGCCCTCTGG ATATTAGACC GGATGGGCGC TGGTAATGTG 840 

TTTAATCCAG GGCAGCAGTT TATCCTTTAT GACGCGATCT ATAAACGCCG CCCGGATGGA 900 

CAGATTTTAG CAGGTGGAGA TGTAGATTAT TCCAGGAAAA AACCAAAATA TTATACGATG 960 

CCTGCATTGC TTGCAGGTAG CTATTATAAA GATGAATACC TTAATTACGA ATTCCTGAAA 1020 

GATCCCAATG TTGAGCCACA TTGCAAATTG TTCGAATTTT TATGGCGGGA TACCCAGTTG 1080 

GGAAGTCGTA AGCCTGATGA TTTGCCACTT TCCAGGTACT CAGGATCGCC TTTTGGATGG 1140 

ATGATTGCCC GTACCGGATG GGGTCCGGAA AGTGTGATTG CAGAGATGAA AGTCAACGAA 1200 

FIG.4A 
TATTCCTTTC TTAACCATCA GCATCAGGAT GCAGGAGCCT TCCAGATCTA TTACAAAGGC 
CCGCTGGCCA TAGATGCAGG CTCGTATACA GGTTCTTCAG GAGGTTATAA CAGTCCGCAC 
AACAAGAACT TTTTTAAGCG GACTATTGCA CACAATAGCT TGCTGATTTA CGATCCTAAA 
GAAACTTTCA GTTCGTCGGG ATATGGTGGA AGTGACCATA CCGATTTTGC TGCCAACGAT 
GGTGGTCAGC GGCTGCCCGG AAAAGGTTGG ATTGCACCCC GCGACCTTAA AGAAATGCTG 
GCAGGCGATT TCAGGACCGG CAAAATTCTT GCCCAGGGCT TTGGTCCGGA TAACCAAACC 
CCTGATTATA CTTATCTGAA AGGAGACATT ACAGCAGCTT ATTCGGCAAA AGTGAAGGAA 
GTAAAACGTT CATTTCTATT CCTGAACCTT AAGGATGCCA AAGTTCCGGC AGCGATGATC 
GTTTTTGACA AGGTAGTTGC TTCCAATCCT GATTTTAAGA AGTTCTGGTT GTTGCACAGT 
ATTGAGCAGC CTGAAATAAA GGGGAATCAG ATTACCATAA AACGTACAAA AAACGGTGAT 
AGTGGGATGT TGGTGAATAC GGCTTTGCTG CCGGATGCGG CCAATTCAAA CATTACCTCC 
ATTGGCGGCA AGGGCAAAGA CTTCTGGGTG TTTGGTACCA ATTATACCAA TGATCCTAAA 
CCGGGCACGG ATGAAGCATT GGAACGTGGA GAATGGCGTG TGGAAATCAC TCCAAAAAAG 
GCAGCAGCCG AAGATTACTA CCTGAATGTG ATACAGATTG CCGACAATAC ACAGCAAAAA 
TTACACGAGG TGAAGCGTAT TGACGGTGAC AAGGTTGTTG GTGTGCAGCT TGCTGACAGG 
ATAGTTACTT TTAGCAAAAC TTCAGAAACT GTTGATCGTC CCTTTGGCTT TTCCGTTGTT 
GGTAAAGGAA CATTCAAATT TGTGATGACC GATCTTTTAG CGGGTACCTG GCAGGTGCTG 
AAAGACGGAA AAATACTTTA TCCTGCGCTT TCTGCAAAAG GTGATGATGG ACCCCTTTAT 
TTTGAAGGAA CTGAAGGAAC CTACCGTTTT TTGAGATAA 
MKRQLYLYVI FVVVELMVFT TKGYSQTKAD VVWKDVDGVS MPIPPKTHPR LYLREQQVPD
                          KPADIP EVKDFR
LKNRMNDPKL KKVWADMIKM QEDWKPADIP EVKDFRFYFN QKGLTVRVEL MALNYLMTKD
                          PEPTIDE 2B

PKVGREAITS IIDTLETATF KPAGDISRGI GLFMVTGAIV YDWCYDQLKP EEKTRFVKAF
                                             EFPEMYNLA AGR
VRLAKMLECG YPPVKDKSIV GHASEWMIMR DLLSVGIAIY DEFPEMYNLA AGRFFKEHLV
                                             PEPTIDE 2A

ARNWFYPSHN YHQGMSYLNV RFTNDLFALW ILDRMGAGNV FNPGQQFILY DAIYKRRPDG

QILAGGDVDY SRKKPKYYTM PALLAGSYYK DEYLNYEFLK DPNVEPHCKL FEFLWRDTQL

GSRKPDDLPL SRYSGSPFGW MIARTGWGPE SVIAEMKVNE YSFLNHQHQD AGAFQIYYKG

PLAIDAGSYT GSSGGYNSPH NKNFFKRTIA HNSLLIYDPK ETFSSSGYGG SDHTDFAAND
                    L AGDFVTGKIL AQGFGPDNQT PDYTYL
GGQRLPGKGW IAPRDLKEML AGDFRTGKIL AQGFGPDNQT PDYTYLKGDI TAAYSAKVKE
                    PEPTIDE 2C

VKRSFLFLNL KDAKVPAAMI VFDKVVASNP DFKKFWLLHS IEQPEIKGNQ ITIKRTKNGD
SGMLVNTALL PDAANSNITS IGGKGKDFWV FGTNYTNDPK PGTDEALERG EWRVEITPKK
AAAEDYYLNV IQIADNTQQK LHEVKRIDGD KVVGVQLADR IVTFSKTSET VDRPFGFSVV
GKGTFKFVMT DLLAGTWQVL KDGKILYPAL SAKGDDGPLY FEGTEGTYRF LR
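
The correspondence between the heparinase II nucleotide sequence of FIG. 4 and the amino acid sequence above can be checked by conceptual translation. The following is a minimal sketch, assuming the standard genetic code and a reading frame that begins at the first ATG; `CODON_TABLE` and `translate` are hypothetical names introduced for illustration, and the two input strings are the first and last rows of the nucleotide sequence as printed.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 1, ignoring block spacing; stop at the first stop codon."""
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    first_row = ("ATGAAAAGAC AATTATACCT GTATGTGATT "
                 "TTTGTTGTAG TTGAACTTAT GGTTTTTACA")
    last_row = "TTTGAAGGAA CTGAAGGAAC CTACCGTTTT TTGAGATAA"
    print(translate(first_row))  # MKRQLYLYVIFVVVELMVFT
    print(translate(last_row))   # FEGTEGTYRFLR
```

Because each full row of the nucleotide figure holds 60 bases (20 codons), rows can be translated independently without shifting the reading frame.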
ATGACTACGA AAATTTTTM AAGGATCATT GTATTTRnr taattpp/v«t 


50 


AICGKEGGA -AAIATACITG-GAGAMfeeTG imTWifniirm 


100 


TTGACCACAT CAACCTTGAG TATTCCGGAC TGGAAaapp t taataaa^* 
w ■ " 1 * wwu/u, i bu/wv\bu 1 1 AATAAAGCA 


150 


GTTGCTGCCG GCAACTATGA CGATGCGGCC AAACrAmr t^ata™ 

W "'WWW/ nWWjWM luoCATACTA 


200 


CAGGGAAAAA AGTAAGGCCA GGGAACCTGA TTTPAPTAat mrA/mm 

™»i«n wBmw/iun iMLflUIAAl UlrAuAAAAGC 


250 


CTGCGGATAT ACGCCAGCCC ATAGATAAGG TTArrppTPA aatyvvw^ 

xwwwwb n i nun I nnOU 1 IflUjLb IbA AATGGCCGAC 


300 


AAGGCTTTGG TCCACCAGTT TCAACCGCAP AAArrrTArr p^tatttw. 

w 1 «^""v*u»u«u /v\Abbi#iAU> GCTATTTTGA 


350 


TTATGGTAAA GACATCAACT GGCAGATGTG GrrrrTAAAA papaa™*** 

..wru-ivi UW/nVAIOIb OUAJuIAAAA GACAATGAAG 


400 


TACGCTGGCA GTTGCACCGT GTAAAATGGT rccLrrn at ^ma«tt 
ww " uinrtrtftiboi bbbAbbCTAT GGCCCTGGTT 


450 


TATCACGCTA CGGGCGATGA AAAATATCrA apapaatyyy» tatat*»«t. 

™ wniun nnwiMlliUA AbAbAATGGG TATATCAGTA 


500 


CAGCGATTGG GCCAGAAAAA ACCCATTRfiG rrTrTrrw AiTuTn.r, 
« vwnumnm Muuww Hjbu lUGTCGCAG GATAATGATA 


550 


AATTTGTGTG GCGGCCCCTT GAAGTGTrrr ^pap-ppta™ «iatatt*« 
w wwowuui i owftjiuiLbb AuAGGGTACA AAGTCTTCCC 


600 


CCAACCTTCA GCTTATTTGT AAACTCGPpa ppptttappp pa/vwpt W 
1 * ,u 1 "«rti'iwLUA bbu I TACCC CAGCCTTTTT 


650 


AATGGAATTT TTAAACAGTT APCAPPAapa /ywymttat 

■ innnunuii muwiulaauA GGCCGATTAT TTATCTACGC 


700 


ATTATGCCGA ACAGGGAAAC CACCGITTAT TTPaappppa AnnA^m* 

vwwnnnw vnvAAJI 1 IAI 1 IbnAbbLCA ACGCAACTTG 


750 


TTTGCAGGGG TATCTTTCCC TGAATTTAAA GATTCACCAA GATGGAGGCA 800 

AACCGGCATA TCGGTGCTGA ACACCGAGAT CAAAAAACAG GTTTATGCCG 850 

ATGGGATGCA GTTTGAACTT TCACCAATTT ACCATGTAGC TGCCATCGAT 900 

ATCTTCTTAA AGGCCTATGG TTCTGCAAAA CGAGTTAACC TTGAAAAAGA 950 

ATTTCCGCAA TCTTATGTAC AAACTGTAGA AAATATGATT ATGGCGCTGA 1000 

TCAGTATTTC ACTGCCAGAT TATAACACCC CTATGTTTGG AGATTCATGG 1050 

ATTACAGATA AAAATTTCAG GATGGCACAG TTTGCCAGCT GGGCCCGGGT 1100 

TTTCCCGGCA AACCAGGCCA TAAAATATTT TGCTACAGAT GGCAAACAAG 1150 

GTAAGGCGCC TAACTTTTTA TCCAAAGCAT TGAGCAATGC AGGCTTTTAT 1200 

ACGTTTAGAA GCGGATGGGA TAAAAATGCA ACCGTTATGG TATTAAAAGC 1250 

CAGTCCTCCC GGAGAATTTC ATGCCCAGCC GGATAACGGG ACTTTTGAAC 1300 

TTTTTATAAA GGGCAGAAAC TTTACCCCAG ACGCCGGGGT ATTTGTGTAT 1350 

AGCGGCGACG AAGCCATCAT GAAACTGCGG AACTGGTACC GTCAAACCCG 1400 

CATACACAGC ACGCTTACAC TCGACAATCA AAATATGGTC ATTACCAAAG 1450 

CCCGGCAAAA CAAATGGGAA ACAGGAAATA ACCTTGATGT GCTTACCTAT 1500 

ACCAACCCAA GCTATCCGAA TCTGGACCAT CAGCGCAGTG TACTTTTCAT 1550 

CAACAAAAAA TACTTTCTGG TCATCGATAG GGCAATAGGC GAAGCTACCG 1600 

GAAACCTGGG CGTACACTGG CAGCTTAAAG AAGACAGCAA CCCTGTTTTC 1650 

GATAAGACAA AGAACCGGGT TTACACCACT TACAGAGATG GTAACAACCT 1700 

GATGATCCAA TCGTTGAATG CGGACAGGAC CAGCCTCAAT GAAGAAGAAG 1750 

GAAAGGTATC TTATGTTTAC AATAAGGAGC TGAAAAGACC TGCTTTCGTA 1800 

TTTGAAAAGC CTAAAAAGAA TGCCGGCACA CAAAATTTTG TCAGTATAGT 1850 

TTATCCATAC GACGGCCAGA AGGCTCCAGA GATCAGCATA CGGGAAAACA 1900 

AGGGCAATGA TTTTGAGAAA GGCAAGCTTA ATCTAACCCT TACCATTAAC 1950 

GGAAAACAAC AGCTTGTGTT GGTTCCTTAG 1980 
MTTKIFKRII VFAVIALSSG NILAQSSSIT RKDFDHINLE YSGLEKVNKA VAAGNYDDAA
                                            KAL VHNFV/PH KGYGYFDYGK
KALLAYYREK SKAREPDFSN AEKPADIRQP IDKVTREMAD KALVHQFQPH KGYGYFDYGK
                                            PEPTIDE 3C

DIN UK -NEVRWLHR VK
DINWQMWPVK DNEVRWQLHR VKWQAMALV YHATGDEKYA REWVYQYSDW ARKNPLGLSQ
PEPTIDE 3A

DNDKFVWRPL EVSDRVQSLP PTFSLFVNSP AFTPAFLMEF LNSYHQQADY LSTHYAEQGN
HRLFEAQRNL FAGVSFPEFK DSPRWRQTGI SVLNTEIKKQ VYADGMQFEL SPIYHVAAID

IFLKAYGSAK RVNLEKEFPQ SYVQTVENMI MALISISLPD YNTPMFGDSW ITDKNFRMAQ
                                                          VLKASPP
FASWARVFPA NQAIKYFATD GKQGKAPNFL SKALSNAGFY TFRSGWDKNA TVMVLKASPP
GEFHAQPDNG TFELFI
GEFHAQPDNG TFELFIKGRN FTPDAGVFVY SGDEAIMKLR NWYRQTRIHS TLTLDNQNMV
PEPTIDE 3B

ITKARQNKWE TGNNLDVLTY TNPSYPNLDH QRSVLFINKK YFLVIDRAIG EATGNLGVHW
QLKEDSNPVF DKTKNRVYTT YRDGNNLMIQ SLNADRTSLN EEEGKVSYVY NKELKRPAFV
FEKPKKNAGT QNFVSIVYPY DGQKAPEISI RENKGNDFEK GKLNLTLTIN GKQQLVLVP



FIG.9 